INN Hotels Project¶

Context¶

A significant number of hotel bookings are called-off due to cancellations or no-shows. The typical reasons for cancellations include change of plans, scheduling conflicts, etc. This is often made easier by the option to do so free of charge or preferably at a low cost which is beneficial to hotel guests but it is a less desirable and possibly revenue-diminishing factor for hotels to deal with. Such losses are particularly high on last-minute cancellations.

The new technologies involving online booking channels have dramatically changed customers’ booking possibilities and behavior. This adds a further dimension to the challenge of how hotels handle cancellations, which are no longer limited to traditional booking and guest characteristics.

The cancellation of bookings impact a hotel on various fronts:

  • Loss of resources (revenue) when the hotel cannot resell the room.
  • Additional costs of distribution channels by increasing commissions or paying for publicity to help sell these rooms.
  • Lowering prices last minute, so the hotel can resell a room, resulting in reducing the profit margin.
  • Human resources to make arrangements for the guests.

Objective¶

The increasing number of cancellations calls for a Machine Learning based solution that can help in predicting which booking is likely to be canceled. INN Hotels Group has a chain of hotels in Portugal, they are facing problems with the high number of booking cancellations and have reached out to your firm for data-driven solutions. You as a data scientist have to analyze the data provided to find which factors have a high influence on booking cancellations, build a predictive model that can predict which booking is going to be canceled in advance, and help in formulating profitable policies for cancellations and refunds.

Data Description¶

The data contains the different attributes of customers' booking details. The detailed data dictionary is given below.

Data Dictionary

  • Booking_ID: unique identifier of each booking
  • no_of_adults: Number of adults
  • no_of_children: Number of Children
  • no_of_weekend_nights: Number of weekend nights (Saturday or Sunday) the guest stayed or booked to stay at the hotel
  • no_of_week_nights: Number of week nights (Monday to Friday) the guest stayed or booked to stay at the hotel
  • type_of_meal_plan: Type of meal plan booked by the customer:
    • Not Selected – No meal plan selected
    • Meal Plan 1 – Breakfast
    • Meal Plan 2 – Half board (breakfast and one other meal)
    • Meal Plan 3 – Full board (breakfast, lunch, and dinner)
  • required_car_parking_space: Does the customer require a car parking space? (0 - No, 1- Yes)
  • room_type_reserved: Type of room reserved by the customer. The values are ciphered (encoded) by INN Hotels.
  • lead_time: Number of days between the date of booking and the arrival date
  • arrival_year: Year of arrival date
  • arrival_month: Month of arrival date
  • arrival_date: Date of the month
  • market_segment_type: Market segment designation.
  • repeated_guest: Is the customer a repeated guest? (0 - No, 1- Yes)
  • no_of_previous_cancellations: Number of previous bookings that were canceled by the customer prior to the current booking
  • no_of_previous_bookings_not_canceled: Number of previous bookings not canceled by the customer prior to the current booking
  • avg_price_per_room: Average price per day of the reservation; prices of the rooms are dynamic. (in euros)
  • no_of_special_requests: Total number of special requests made by the customer (e.g. high floor, view from the room, etc)
  • booking_status: Flag indicating if the booking was canceled or not.

Importing necessary libraries and data¶

In [7]:
# this will help in making the Python code more structured automatically (good coding practice)
!pip install black[jupyter] --quiet

import warnings

warnings.filterwarnings("ignore")
from statsmodels.tools.sm_exceptions import ConvergenceWarning

warnings.simplefilter("ignore", ConvergenceWarning)

from IPython.core.interactiveshell import InteractiveShell

InteractiveShell.ast_node_interactivity = "all"
from IPython.display import display
from matplotlib.ticker import MaxNLocator


# Libraries to help with reading and manipulating data
import pandas as pd
import numpy as np

# Library to split data
from sklearn.model_selection import train_test_split

# libaries to help with data visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Removes the limit for the number of displayed columns
pd.set_option("display.max_columns", None)

# Sets the limit for the number of displayed rows
pd.set_option("display.max_rows", 200)


# To build model for prediction
import statsmodels.stats.api as sms

# to compute VIF
from statsmodels.stats.outliers_influence import variance_inflation_factor

# to build linear regression_model using statsmodels
import statsmodels.api as sm

from statsmodels.tools.tools import add_constant

# to build linear regression_model
from sklearn.linear_model import LinearRegression

# to build logistic regression_model
from sklearn.linear_model import LogisticRegression

# to check model performance
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# To get diferent metric scores
from sklearn.metrics import (
    f1_score,
    accuracy_score,
    recall_score,
    precision_score,
    confusion_matrix,
    roc_auc_score,
    #plot_confusion_matrix,
    precision_recall_curve,
    roc_curve,
    make_scorer
)

custom = {"axes.edgecolor": "purple", "grid.linestyle": "solid", "grid.color": "black"}
sns.set_style("dark", rc=custom)

#format numeric data for easier readability
pd.set_option("display.float_format", lambda x: "{:.2f}".format(x)) # to display numbers rounded off to 2 decimal places

%matplotlib inline

# Libraries to build decision tree classifier
from sklearn.tree import DecisionTreeClassifier
from sklearn import tree

# To tune different models
from sklearn.model_selection import GridSearchCV

Loading the dataset¶

In [8]:
# let colab access my google drive
from google.colab import drive

drive.mount("/content/drive")
Mounted at /content/drive
In [9]:
# Loading the dataset - sheet_name parameter is used if there are multiple tabs in the excel file.
df = pd.read_csv("/content/drive/MyDrive/Python_Course/Project_4/INNHotelsGroup.csv")

Data Overview¶

  • Observations
  • Sanity checks

View the first and last 5 rows of the dataset.¶

In [10]:
df.head()
Out[10]:
Booking_ID no_of_adults no_of_children no_of_weekend_nights no_of_week_nights type_of_meal_plan required_car_parking_space room_type_reserved lead_time arrival_year arrival_month arrival_date market_segment_type repeated_guest no_of_previous_cancellations no_of_previous_bookings_not_canceled avg_price_per_room no_of_special_requests booking_status
0 INN00001 2 0 1 2 Meal Plan 1 0 Room_Type 1 224 2017 10 2 Offline 0 0 0 65.00 0 Not_Canceled
1 INN00002 2 0 2 3 Not Selected 0 Room_Type 1 5 2018 11 6 Online 0 0 0 106.68 1 Not_Canceled
2 INN00003 1 0 2 1 Meal Plan 1 0 Room_Type 1 1 2018 2 28 Online 0 0 0 60.00 0 Canceled
3 INN00004 2 0 0 2 Meal Plan 1 0 Room_Type 1 211 2018 5 20 Online 0 0 0 100.00 0 Canceled
4 INN00005 2 0 1 1 Not Selected 0 Room_Type 1 48 2018 4 11 Online 0 0 0 94.50 0 Canceled
In [11]:
df.tail()
Out[11]:
Booking_ID no_of_adults no_of_children no_of_weekend_nights no_of_week_nights type_of_meal_plan required_car_parking_space room_type_reserved lead_time arrival_year arrival_month arrival_date market_segment_type repeated_guest no_of_previous_cancellations no_of_previous_bookings_not_canceled avg_price_per_room no_of_special_requests booking_status
36270 INN36271 3 0 2 6 Meal Plan 1 0 Room_Type 4 85 2018 8 3 Online 0 0 0 167.80 1 Not_Canceled
36271 INN36272 2 0 1 3 Meal Plan 1 0 Room_Type 1 228 2018 10 17 Online 0 0 0 90.95 2 Canceled
36272 INN36273 2 0 2 6 Meal Plan 1 0 Room_Type 1 148 2018 7 1 Online 0 0 0 98.39 2 Not_Canceled
36273 INN36274 2 0 0 3 Not Selected 0 Room_Type 1 63 2018 4 21 Online 0 0 0 94.50 0 Canceled
36274 INN36275 2 0 1 2 Meal Plan 1 0 Room_Type 1 207 2018 12 30 Offline 0 0 0 161.67 0 Not_Canceled
  • The dataset contains information about the different attributes of customers' booking details.

Understand the shape of the dataset.¶

In [12]:
df.shape
Out[12]:
(36275, 19)

There are 36275 rows and 19 columns in the dataset.

Check the data types of the columns for the dataset.¶

In [13]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 36275 entries, 0 to 36274
Data columns (total 19 columns):
 #   Column                                Non-Null Count  Dtype  
---  ------                                --------------  -----  
 0   Booking_ID                            36275 non-null  object 
 1   no_of_adults                          36275 non-null  int64  
 2   no_of_children                        36275 non-null  int64  
 3   no_of_weekend_nights                  36275 non-null  int64  
 4   no_of_week_nights                     36275 non-null  int64  
 5   type_of_meal_plan                     36275 non-null  object 
 6   required_car_parking_space            36275 non-null  int64  
 7   room_type_reserved                    36275 non-null  object 
 8   lead_time                             36275 non-null  int64  
 9   arrival_year                          36275 non-null  int64  
 10  arrival_month                         36275 non-null  int64  
 11  arrival_date                          36275 non-null  int64  
 12  market_segment_type                   36275 non-null  object 
 13  repeated_guest                        36275 non-null  int64  
 14  no_of_previous_cancellations          36275 non-null  int64  
 15  no_of_previous_bookings_not_canceled  36275 non-null  int64  
 16  avg_price_per_room                    36275 non-null  float64
 17  no_of_special_requests                36275 non-null  int64  
 18  booking_status                        36275 non-null  object 
dtypes: float64(1), int64(13), object(5)
memory usage: 5.3+ MB

There are 36275 rows and 19 columns in the data frame.
Booking_ID , type_of_meal_plan, room_type_reserved, market_segment_type, and booking_status are all objects. Should be updated to be categories.
no-of_adults, no_of_children, no-of_weekend_nights, no_of_week_nights, required_car_parking_space, lead_time, arrivaltime. arrival_month, arrival_date, repeated_guest, no_of_previous_calccellations, no_of_previous)bookings_not_cancelled, and no_of_special_requests are all integers.
avg_price_per_room is a float
Dependent variable is booking_status.
There is no missing data.

In [14]:
df.nunique()
Out[14]:
Booking_ID                              36275
no_of_adults                                5
no_of_children                              6
no_of_weekend_nights                        8
no_of_week_nights                          18
type_of_meal_plan                           4
required_car_parking_space                  2
room_type_reserved                          7
lead_time                                 352
arrival_year                                2
arrival_month                              12
arrival_date                               31
market_segment_type                         5
repeated_guest                              2
no_of_previous_cancellations                9
no_of_previous_bookings_not_canceled       59
avg_price_per_room                       3930
no_of_special_requests                      6
booking_status                              2
dtype: int64
In [15]:
# Copy data to avoid any changes to original date
df2 = df.copy()

Check the data types of the columns for the dataset.¶

In [16]:
# coverting "objects" to "category" reduces the data space required to store the dataframe
# converting type_of_meal_plan, and market_segment_type into categorical data
#not converting Booking_ID as it increases the memory
#not converting booking_status as I will be converting that to 0 or 1

for col in ["type_of_meal_plan", "market_segment_type","booking_status", "room_type_reserved"]:
   df2[col] = df2[col].astype("category")

# Use info() to print a concise summary of the DataFrame
df2.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 36275 entries, 0 to 36274
Data columns (total 19 columns):
 #   Column                                Non-Null Count  Dtype   
---  ------                                --------------  -----   
 0   Booking_ID                            36275 non-null  object  
 1   no_of_adults                          36275 non-null  int64   
 2   no_of_children                        36275 non-null  int64   
 3   no_of_weekend_nights                  36275 non-null  int64   
 4   no_of_week_nights                     36275 non-null  int64   
 5   type_of_meal_plan                     36275 non-null  category
 6   required_car_parking_space            36275 non-null  int64   
 7   room_type_reserved                    36275 non-null  category
 8   lead_time                             36275 non-null  int64   
 9   arrival_year                          36275 non-null  int64   
 10  arrival_month                         36275 non-null  int64   
 11  arrival_date                          36275 non-null  int64   
 12  market_segment_type                   36275 non-null  category
 13  repeated_guest                        36275 non-null  int64   
 14  no_of_previous_cancellations          36275 non-null  int64   
 15  no_of_previous_bookings_not_canceled  36275 non-null  int64   
 16  avg_price_per_room                    36275 non-null  float64 
 17  no_of_special_requests                36275 non-null  int64   
 18  booking_status                        36275 non-null  category
dtypes: category(4), float64(1), int64(13), object(1)
memory usage: 4.3+ MB
In [17]:
# Convert the category columns into objects:


# Identify categorical columns
categorical_cols = df2.select_dtypes(['category']).columns
categorical_cols

# Convert categorical columns to object
df2[categorical_cols] = df2[categorical_cols].astype('object')
Out[17]:
Index(['type_of_meal_plan', 'room_type_reserved', 'market_segment_type',
       'booking_status'],
      dtype='object')
In [18]:
df2.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 36275 entries, 0 to 36274
Data columns (total 19 columns):
 #   Column                                Non-Null Count  Dtype  
---  ------                                --------------  -----  
 0   Booking_ID                            36275 non-null  object 
 1   no_of_adults                          36275 non-null  int64  
 2   no_of_children                        36275 non-null  int64  
 3   no_of_weekend_nights                  36275 non-null  int64  
 4   no_of_week_nights                     36275 non-null  int64  
 5   type_of_meal_plan                     36275 non-null  object 
 6   required_car_parking_space            36275 non-null  int64  
 7   room_type_reserved                    36275 non-null  object 
 8   lead_time                             36275 non-null  int64  
 9   arrival_year                          36275 non-null  int64  
 10  arrival_month                         36275 non-null  int64  
 11  arrival_date                          36275 non-null  int64  
 12  market_segment_type                   36275 non-null  object 
 13  repeated_guest                        36275 non-null  int64  
 14  no_of_previous_cancellations          36275 non-null  int64  
 15  no_of_previous_bookings_not_canceled  36275 non-null  int64  
 16  avg_price_per_room                    36275 non-null  float64
 17  no_of_special_requests                36275 non-null  int64  
 18  booking_status                        36275 non-null  object 
dtypes: float64(1), int64(13), object(5)
memory usage: 5.3+ MB

All variable type are now correct.

Checking for duplicate values¶

In [19]:
df2.duplicated().sum()
Out[19]:
0

Checking for missing values¶

In [20]:
df2.isnull().sum()
Out[20]:
Booking_ID                              0
no_of_adults                            0
no_of_children                          0
no_of_weekend_nights                    0
no_of_week_nights                       0
type_of_meal_plan                       0
required_car_parking_space              0
room_type_reserved                      0
lead_time                               0
arrival_year                            0
arrival_month                           0
arrival_date                            0
market_segment_type                     0
repeated_guest                          0
no_of_previous_cancellations            0
no_of_previous_bookings_not_canceled    0
avg_price_per_room                      0
no_of_special_requests                  0
booking_status                          0
dtype: int64
In [21]:
# checking for unique values
df2.nunique()
Out[21]:
Booking_ID                              36275
no_of_adults                                5
no_of_children                              6
no_of_weekend_nights                        8
no_of_week_nights                          18
type_of_meal_plan                           4
required_car_parking_space                  2
room_type_reserved                          7
lead_time                                 352
arrival_year                                2
arrival_month                              12
arrival_date                               31
market_segment_type                         5
repeated_guest                              2
no_of_previous_cancellations                9
no_of_previous_bookings_not_canceled       59
avg_price_per_room                       3930
no_of_special_requests                      6
booking_status                              2
dtype: int64

Statistical summary¶

In [22]:
df2.describe().T
Out[22]:
count mean std min 25% 50% 75% max
no_of_adults 36275.00 1.84 0.52 0.00 2.00 2.00 2.00 4.00
no_of_children 36275.00 0.11 0.40 0.00 0.00 0.00 0.00 10.00
no_of_weekend_nights 36275.00 0.81 0.87 0.00 0.00 1.00 2.00 7.00
no_of_week_nights 36275.00 2.20 1.41 0.00 1.00 2.00 3.00 17.00
required_car_parking_space 36275.00 0.03 0.17 0.00 0.00 0.00 0.00 1.00
lead_time 36275.00 85.23 85.93 0.00 17.00 57.00 126.00 443.00
arrival_year 36275.00 2017.82 0.38 2017.00 2018.00 2018.00 2018.00 2018.00
arrival_month 36275.00 7.42 3.07 1.00 5.00 8.00 10.00 12.00
arrival_date 36275.00 15.60 8.74 1.00 8.00 16.00 23.00 31.00
repeated_guest 36275.00 0.03 0.16 0.00 0.00 0.00 0.00 1.00
no_of_previous_cancellations 36275.00 0.02 0.37 0.00 0.00 0.00 0.00 13.00
no_of_previous_bookings_not_canceled 36275.00 0.15 1.75 0.00 0.00 0.00 0.00 58.00
avg_price_per_room 36275.00 103.42 35.09 0.00 80.30 99.45 120.00 540.00
no_of_special_requests 36275.00 0.62 0.79 0.00 0.00 0.00 1.00 5.00

Average adults is between 1 and 2.
Most people do not bring children.
Average no_of_weekend_nights is less than 1.
Average no_of_week_nights is ~2
Lead_time is betwwen 0 and 443 days with the average being ~85 days.
arrival_year is between 2017 and 2018.

Exploratory Data Analysis (EDA)¶

  • EDA is an important part of any project involving data.
  • It is important to investigate and understand the data better before building a model with it.
  • A few questions have been mentioned below which will help you approach the analysis in the right manner and generate insights from the data.
  • A thorough analysis of the data, in addition to the questions mentioned below, should be done.

Leading Questions:

  1. What are the busiest months in the hotel?
  2. Which market segment do most of the guests come from?
  3. Hotel rates are dynamic and change according to demand and customer demographics. What are the differences in room prices in different market segments?
  4. What percentage of bookings are canceled?
  5. Repeating guests are the guests who stay in the hotel often and are important to brand equity. What percentage of repeating guests cancel?
  6. Many guests have special requirements when booking a hotel room. Do these requirements affect booking cancellation?

Univariate Analysis¶

In [23]:
# function to plot a boxplot and a histogram along the same scale.
def histogram_boxplot(df2, feature, figsize=(12, 7), kde=False, bins=None):
    """
    Creates a combined boxplot and histogram for a given feature in the dataset.

    Args:
        df2: The input dataframe.
        feature (str): The column name for which to create the plot.
        figsize (tuple, optional): Size of the figure (default: (12, 7)).
        kde (bool, optional): Whether to show the density curve (default: False).
        bins (int, optional): Number of bins for the histogram (default: None).

    Returns:
        None (displays the plot)
    """
    fig, (ax_box, ax_hist) = plt.subplots(
        nrows=2,
        sharex=True,
        figsize=figsize,
        gridspec_kw={"height_ratios": (0.25, 0.75)},
    )

    # Boxplot
    sns.boxplot(data=df2, x=feature, ax=ax_box, showmeans=True, color="#F72585")

    # Histogram
    if bins is None:
        unique_values = df2[feature].unique()
        bins = np.linspace(unique_values.min() - 1, unique_values.max() + 2, num=25)

    sns.histplot(data=df2, x=feature, bins=bins, kde=True, ax=ax_hist)

    # Add mean and median lines
    ax_hist.axvline(df2[feature].mean(), color="purple", linestyle="--", label="Mean")
    ax_hist.axvline(df2[feature].median(), color="blue", linestyle="-", label="Median")

    # Label each bar with its count
    for j, p in enumerate(ax_hist.patches):
        ax_hist.annotate(
            f"{int(p.get_height())}",
            (p.get_x() + p.get_width() / 2.0, p.get_height()),
            ha="center",
            va="center",
            xytext=(1, 10),
            textcoords="offset points",
        )

    ax_hist.legend()
    ax_hist.set_xlabel(feature)
    ax_hist.set_ylabel("Frequency")
    ax_hist.set_title(f"Frequency of {feature}")

    plt.tight_layout()
In [24]:
# function to create labeled barplots


def labeled_barplot(df2, feature, perc=False, n=None):
    """
    Barplot with percentage at the top
    df2: dataframe
    feature: dataframe column
    perc: whether to display percentages instead of count (default is False)
    n: displays the top n category levels (default is None, i.e., display all levels)
    """
    total = len(df2[feature])  # length of the column
    count = df2[feature].nunique()

    if n is None:
        plt.figure(figsize=(count + 1, 10))
    else:
        plt.figure(figsize=(n + 1, 10))

    plt.xticks(rotation=90, fontsize=15)
    ax = sns.countplot(
        data=df2,
        x=feature,  # Assign the x variable to hue
        palette="cubehelix",  # Set the hue to the same variable
        legend=False,  # Disable the legend
        order=df2[feature].value_counts().index[:n].sort_values(),
    )

    # Annotate each bar with its count and percentage
    for p in ax.patches:
        prc = "{:.1f}%".format(100.0 * p.get_height() / total)  # percentage
        cnt = p.get_height()  # count
        xx = p.get_x() + p.get_width() / 2  # x coordinate of bar percentage label
        yy = p.get_height()  # y coordinate of bar percentage label

        # Annotate percentage
        ax.annotate(
            prc,
            (xx, yy),
            ha="center",
            va="center",
            style="italic",
            size=12,
            xytext=(0, 10),
            textcoords="offset points",
        )

        # Annotate count (adjust vertical position)
        ax.annotate(
            cnt,
            (xx, yy + 100),
            ha="center",
            va="bottom",  # Adjusted to display above the percentage label
            size=12,
            xytext=(0, 20),
            textcoords="offset points",
        )

    # Increase y-axis size by 500
    plt.ylim(0, ax.get_ylim()[1] + 500)
In [25]:
def stacked_barplot(df2, predictor, target, palette=None):
    """
    Print the category counts and plot a stacked bar chart
    data: dataframe
    predictor: independent variable
    target: target variable
    palette: list of colors (optional)
    """
    count = df2[predictor].nunique()
    sorter = df2[target].value_counts().index[-1]

    # Use a custom palette or default to Matplotlib's default colors
    if palette:
        colors = palette
    else:
        # Default colors (you can replace these with your own)
        colors = ["#06C2AC", "#9A0EEA", "#ED0DD9", "#0000BB", "#DC143C"]
        #Colors are Teal, Violet, Fuchsia, Navy, and Crimson

    tab1 = pd.crosstab(df2[predictor], df2[target], margins=True).sort_values(
        by=sorter, ascending=False
    )
    print(tab1)
    print("-" * 120)

    tab = pd.crosstab(df2[predictor], df2[target], normalize="index").sort_values(
        by=sorter, ascending=False
    )

    # Plot using the specified colors
    tab.plot(kind="bar", stacked=True, figsize=(count + 5, 5), color=colors)

    plt.legend(loc="lower left", frameon=False)
    plt.legend(loc="upper left", bbox_to_anchor=(1, 1))
    plt.show()
In [26]:
### function to plot distributions wrt target


def distribution_plot_wrt_target(df2, predictor, target):

    fig, axs = plt.subplots(2, 2, figsize=(15, 10))

    target_uniq = df2[target].unique()

    axs[0, 0].set_title("Distribution of target for target=" + str(target_uniq[0]))
    sns.histplot(
        data=df2[df2[target] == target_uniq[0]],
        x=predictor,
        kde=True,
        ax=axs[0, 0],
        color="aqua",
        stat="density",
    )

    axs[0, 1].set_title("Distribution of target for target=" + str(target_uniq[1]))
    sns.histplot(
        data=df2[df2[target] == target_uniq[1]],
        x=predictor,
        kde=True,
        ax=axs[0, 1],
        color="indigo",
        stat="density",
    )

    axs[1, 0].set_title("Boxplot w.r.t target")
    sns.boxplot(data=df2, x=target, y=predictor, ax=axs[1, 0], palette="gist_rainbow")

    axs[1, 1].set_title("Boxplot (without outliers) w.r.t target")
    sns.boxplot(
        data=df2,
        x=target,
        y=predictor,
        ax=axs[1, 1],
        showfliers=False,
        palette="plasma",
    )

    plt.tight_layout()
    plt.show()
In [27]:
# Create a figure with a specified size
plt.figure(figsize=(20, 6))

# Plot the histogram and boxplot
histogram_boxplot(df2, "no_of_adults")

# Set the x-axis label
plt.xlabel("No of Adults per Booking")

df2["no_of_adults"].value_counts()
print()
df2["no_of_adults"].describe().T
Out[27]:
<Figure size 2000x600 with 0 Axes>
Out[27]:
Text(0.5, 47.722222222222285, 'No of Adults per Booking')
Out[27]:
no_of_adults
2    26108
1     7695
3     2317
0      139
4       16
Name: count, dtype: int64

Out[27]:
count   36275.00
mean        1.84
std         0.52
min         0.00
25%         2.00
50%         2.00
75%         2.00
max         4.00
Name: no_of_adults, dtype: float64
<Figure size 2000x600 with 0 Axes>

Most of the bookings have 2 adults.
139 are showing zero adults. These should be researched further.
Most INN hotels do not allow kids under 16 to stay without an adult.
If we were supplied the age we would be able to determine if the 0 were correct or needed to be replace by the mean.
Without that additional information we will leave the 0s, they all are showing children staying. None are zero adults and zero children.

In [28]:
# Create a figure with a specified size
plt.figure(figsize=(20, 6))

# Plot the histogram and boxplot
histogram_boxplot(df2, "no_of_children")

# Set the x-axis label
plt.xlabel("No of Children per Booking")

df2["no_of_children"].value_counts()
print()
df2["no_of_children"].describe().T
Out[28]:
<Figure size 2000x600 with 0 Axes>
Out[28]:
Text(0.5, 47.722222222222285, 'No of Children per Booking')
Out[28]:
no_of_children
0     33577
1      1618
2      1058
3        19
9         2
10        1
Name: count, dtype: int64

Out[28]:
count   36275.00
mean        0.11
std         0.40
min         0.00
25%         0.00
50%         0.00
75%         0.00
max        10.00
Name: no_of_children, dtype: float64
<Figure size 2000x600 with 0 Axes>

25%, 50%, and 75% are zero.
Most adults book without children.
The max amount of students is 10. The count of 8 to 10 are outlier that should be removed.
These outliers do not happen much so they could skew the results.

In [29]:
# Create a figure with a specified size
plt.figure(figsize=(20, 6))

# Plot the histogram and boxplot
histogram_boxplot(df2, "no_of_weekend_nights")

# Set the x-axis label
plt.xlabel("No of Weekend Nights per Booking")

df2["no_of_weekend_nights"].value_counts()
print()
df2["no_of_weekend_nights"].describe().T
Out[29]:
<Figure size 2000x600 with 0 Axes>
Out[29]:
Text(0.5, 47.722222222222285, 'No of Weekend Nights per Booking')
Out[29]:
no_of_weekend_nights
0    16872
1     9995
2     9071
3      153
4      129
5       34
6       20
7        1
Name: count, dtype: int64

Out[29]:
count   36275.00
mean        0.81
std         0.87
min         0.00
25%         0.00
50%         1.00
75%         2.00
max         7.00
Name: no_of_weekend_nights, dtype: float64
<Figure size 2000x600 with 0 Axes>

Most of the bookings are between 0 and 2. With the majority being zero
Average bookings include 0 to 1 weekend night.

In [30]:
# Create a figure with a specified size
plt.figure(figsize=(20, 6))

# Plot the histogram and boxplot
histogram_boxplot(df2, "no_of_week_nights")

# Set the x-axis label
plt.xlabel("No of Week Nights per Booking")

df2["no_of_week_nights"].value_counts()
print()
df2["no_of_week_nights"].describe().T
Out[30]:
<Figure size 2000x600 with 0 Axes>
Out[30]:
Text(0.5, 47.722222222222285, 'No of Week Nights per Booking')
Out[30]:
no_of_week_nights
2     11444
1      9488
3      7839
4      2990
0      2387
5      1614
6       189
7       113
10       62
8        62
9        34
11       17
15       10
12        9
14        7
13        5
17        3
16        2
Name: count, dtype: int64

Out[30]:
count   36275.00
mean        2.20
std         1.41
min         0.00
25%         1.00
50%         2.00
75%         3.00
max        17.00
Name: no_of_week_nights, dtype: float64
<Figure size 2000x600 with 0 Axes>

Max number of week nights is 17.
Average number of week nights is 2.
Most of the count of week nights is between O and 3.

In [31]:
# Create a figure with a specified size
plt.figure(figsize=(20, 6))

# Plot the histogram and boxplot
histogram_boxplot(df2, "required_car_parking_space")

# Set the x-axis label
plt.xlabel("Required Car Parking Space per Booking")

df2["required_car_parking_space"].value_counts()
print()
df2["required_car_parking_space"].describe().T
Out[31]:
<Figure size 2000x600 with 0 Axes>
Out[31]:
Text(0.5, 47.722222222222285, 'Required Car Parking Space per Booking')
Out[31]:
required_car_parking_space
0    35151
1     1124
Name: count, dtype: int64

Out[31]:
count   36275.00
mean        0.03
std         0.17
min         0.00
25%         0.00
50%         0.00
75%         0.00
max         1.00
Name: required_car_parking_space, dtype: float64
<Figure size 2000x600 with 0 Axes>

Most people do not required a car parking space.
Out of all the bookings on 1124 asked for a parking space.

In [32]:
# Create a figure with a specified size
plt.figure(figsize=(20, 6))

# Plot the histogram and boxplot
histogram_boxplot(df2, "lead_time")

# Set the x-axis label
plt.xlabel("Lead Time per Booking")

df2["lead_time"].value_counts()
print()
df2["lead_time"].describe().T
Out[32]:
<Figure size 2000x600 with 0 Axes>
Out[32]:
Text(0.5, 47.722222222222285, 'Lead Time per Booking')
Out[32]:
lead_time
0      1297
1      1078
2       643
3       630
4       628
       ... 
300       1
353       1
328       1
352       1
351       1
Name: count, Length: 352, dtype: int64

Out[32]:
count   36275.00
mean       85.23
std        85.93
min         0.00
25%        17.00
50%        57.00
75%       126.00
max       443.00
Name: lead_time, dtype: float64
<Figure size 2000x600 with 0 Axes>

Average lead time is 85 days.
Most people book their rooms less than 400 days before their stay.
Over 25% of people book their stay 20 days or less before their stay. Over 75% of people book their stay less than 1/2 year before their stay.

In [33]:
# Create a figure with a specified size
plt.figure(figsize=(20, 6))

# Plot the histogram and boxplot
histogram_boxplot(df2, "arrival_year")

# Set the x-axis label
plt.xlabel("Arrival Year per Booking")

df2["arrival_year"].value_counts()
print()
df2["arrival_year"].describe().T
Out[33]:
<Figure size 2000x600 with 0 Axes>
Out[33]:
Text(0.5, 47.722222222222285, 'Arrival Year per Booking')
Out[33]:
arrival_year
2018    29761
2017     6514
Name: count, dtype: int64

Out[33]:
count   36275.00
mean     2017.82
std         0.38
min      2017.00
25%      2018.00
50%      2018.00
75%      2018.00
max      2018.00
Name: arrival_year, dtype: float64
<Figure size 2000x600 with 0 Axes>

Most booking took place in 2018.
18% of all booking were in 2017.
82% of all bookings were in 2018.

In [34]:
# Create a figure with a specified size
plt.figure(figsize=(20, 6))

# Plot the histogram and boxplot
histogram_boxplot(df2, "arrival_month")

# Set the x-axis label
plt.xlabel("Arrival Month per Booking")

df2["arrival_month"].value_counts()
print()
df2["arrival_month"].describe().T
Out[34]:
<Figure size 2000x600 with 0 Axes>
Out[34]:
Text(0.5, 47.722222222222285, 'Arrival Month per Booking')
Out[34]:
arrival_month
10    5317
9     4611
8     3813
6     3203
12    3021
11    2980
7     2920
4     2736
5     2598
3     2358
2     1704
1     1014
Name: count, dtype: int64

Out[34]:
count   36275.00
mean        7.42
std         3.07
min         1.00
25%         5.00
50%         8.00
75%        10.00
max        12.00
Name: arrival_month, dtype: float64
<Figure size 2000x600 with 0 Axes>

The most bookings took place in month 10 (October).
The fewest bookings took place in January. Winter had 5,739 bookings. That accounts for 16% of the bookings.
Spring had 7,692 bookings. That accounts for 21% of the bookings.
Summer had 9,936 bookings. That accounts for 27% of the bookings.
Fall had 12,908 bookings. That accounts for 36% of the bookings.

In [35]:
#group by *arrival month*, count number of records per month, sort from most to fewest bookings, and show top 3 months
df2.groupby('arrival_month').count().sort_values(by='booking_status', ascending=False)['booking_status'].head(3)
Out[35]:
arrival_month
10    5317
9     4611
8     3813
Name: booking_status, dtype: int64

1. What are the busiest months in the hotel.

The top 3 months for bookings are:

  1. October
  2. September
  3. August

Fall (September - November) is the most popular season to book a hotel room.

In [36]:
# Create a figure with a specified size
plt.figure(figsize=(25, 10))

# Plot the histogram and boxplot
histogram_boxplot(df2, "arrival_date")

# Set the x-axis label
plt.xlabel("Arrival Date per Booking")

df2["arrival_date"].value_counts()
print()
df2["arrival_date"].describe().T
Out[36]:
<Figure size 2500x1000 with 0 Axes>
Out[36]:
Text(0.5, 47.722222222222285, 'Arrival Date per Booking')
Out[36]:
arrival_date
13    1358
17    1345
2     1331
4     1327
19    1327
16    1306
20    1281
15    1273
6     1273
18    1260
14    1242
30    1216
12    1204
8     1198
29    1190
21    1158
5     1154
26    1146
25    1146
1     1133
9     1130
28    1129
7     1110
24    1103
11    1098
3     1098
10    1089
27    1059
22    1023
23     990
31     578
Name: count, dtype: int64

Out[36]:
count   36275.00
mean       15.60
std         8.74
min         1.00
25%         8.00
50%        16.00
75%        23.00
max        31.00
Name: arrival_date, dtype: float64
<Figure size 2500x1000 with 0 Axes>

11843 books have an arrival date of 1st - 10th of the month. ~33% of all bookings.
12694 books have an arrival date of 11th - 20th of the month. ~35% of all bookings.
11738 books have an arrival date of 21st - 31st of the month. ~32% of all bookings.

In [37]:
# Create a figure with a specified size
plt.figure(figsize=(25, 10))

# Plot the histogram and boxplot
histogram_boxplot(df2, "repeated_guest")

# Set the x-axis label
plt.xlabel("Repeated Guest per Booking")

df2["repeated_guest"].value_counts()
print()
df2["repeated_guest"].describe().T
Out[37]:
<Figure size 2500x1000 with 0 Axes>
Out[37]:
Text(0.5, 47.722222222222285, 'Repeated Guest per Booking')
Out[37]:
repeated_guest
0    35345
1      930
Name: count, dtype: int64

Out[37]:
count   36275.00
mean        0.03
std         0.16
min         0.00
25%         0.00
50%         0.00
75%         0.00
max         1.00
Name: repeated_guest, dtype: float64
<Figure size 2500x1000 with 0 Axes>

Most bookings are not repeated guests.
Only 930 bookings are from repeated guests.
More ressearch should be done to determine why more guests are not booking additional stays with the hotels.

In [38]:
# Create a figure with a specified size
plt.figure(figsize=(25, 10))

# Plot the histogram and boxplot
histogram_boxplot(df2, "no_of_previous_cancellations")

# Set the x-axis label
plt.xlabel("Number of Previous Cancellations per Booking")

df2["no_of_previous_cancellations"].value_counts()
print()
df2["no_of_previous_cancellations"].describe().T
Out[38]:
<Figure size 2500x1000 with 0 Axes>
Out[38]:
Text(0.5, 47.722222222222285, 'Number of Previous Cancellations per Booking')
Out[38]:
no_of_previous_cancellations
0     35937
1       198
2        46
3        43
11       25
5        11
4        10
13        4
6         1
Name: count, dtype: int64

Out[38]:
count   36275.00
mean        0.02
std         0.37
min         0.00
25%         0.00
50%         0.00
75%         0.00
max        13.00
Name: no_of_previous_cancellations, dtype: float64
<Figure size 2500x1000 with 0 Axes>

Most bookings are not cancelled.

In [39]:
# Create a figure with a specified size
plt.figure(figsize=(25, 10))

# Plot the histogram and boxplot
histogram_boxplot(df2, "no_of_previous_bookings_not_canceled")

# Set the x-axis label
plt.xlabel("Number of Previous Bookings Not Canceled per Booking")

df2["no_of_previous_bookings_not_canceled"].value_counts()
print()
df2["no_of_previous_bookings_not_canceled"].describe().T
Out[39]:
<Figure size 2500x1000 with 0 Axes>
Out[39]:
Text(0.5, 47.722222222222285, 'Number of Previous Bookings Not Canceled per Booking')
Out[39]:
no_of_previous_bookings_not_canceled
0     35463
1       228
2       112
3        80
4        65
5        60
6        36
7        24
8        23
10       19
9        19
11       15
12       12
14        9
15        8
16        7
13        7
18        6
20        6
21        6
17        6
19        6
22        6
25        3
27        3
24        3
23        3
44        2
29        2
48        2
28        2
30        2
32        2
31        2
26        2
46        1
55        1
45        1
57        1
53        1
54        1
58        1
41        1
40        1
43        1
35        1
50        1
56        1
33        1
37        1
42        1
51        1
38        1
34        1
39        1
52        1
49        1
47        1
36        1
Name: count, dtype: int64

Out[39]:
count   36275.00
mean        0.15
std         1.75
min         0.00
25%         0.00
50%         0.00
75%         0.00
max        58.00
Name: no_of_previous_bookings_not_canceled, dtype: float64
<Figure size 2500x1000 with 0 Axes>

Per the graph most previous booking have not been cancelled.

In [40]:
# Create a figure with a specified size
plt.figure(figsize=(25, 10))

# Plot the histogram and boxplot
histogram_boxplot(df2, "avg_price_per_room")

# Set the x-axis label
plt.xlabel("Average Price Per Room per Booking")

df2["avg_price_per_room"].value_counts()
print()
df2["avg_price_per_room"].describe().T
Out[40]:
<Figure size 2500x1000 with 0 Axes>
Out[40]:
Text(0.5, 47.722222222222285, 'Average Price Per Room per Booking')
Out[40]:
avg_price_per_room
65.00     848
75.00     826
90.00     703
95.00     669
115.00    662
         ... 
212.42      1
83.48       1
70.42       1
130.99      1
167.80      1
Name: count, Length: 3930, dtype: int64

Out[40]:
count   36275.00
mean      103.42
std        35.09
min         0.00
25%        80.30
50%        99.45
75%       120.00
max       540.00
Name: avg_price_per_room, dtype: float64
<Figure size 2500x1000 with 0 Axes>

The room rate averages around 100.
The 627 that are showing a rate of 20 or less are free or discounted rooms.
Most rooms are less than 250.
75% of all rooms cost 120 or less.

In [41]:
# Create a figure with a specified size
plt.figure(figsize=(25, 10))

# Plot the histogram and boxplot
histogram_boxplot(df2, "no_of_special_requests")

# Set the x-axis label
plt.xlabel("Number of Special Requests per Booking")

df2["no_of_special_requests"].value_counts()
print()
df2["no_of_special_requests"].describe().T
Out[41]:
<Figure size 2500x1000 with 0 Axes>
Out[41]:
Text(0.5, 47.722222222222285, 'Number of Special Requests per Booking')
Out[41]:
no_of_special_requests
0    19777
1    11373
2     4364
3      675
4       78
5        8
Name: count, dtype: int64

Out[41]:
count   36275.00
mean        0.62
std         0.79
min         0.00
25%         0.00
50%         0.00
75%         1.00
max         5.00
Name: no_of_special_requests, dtype: float64
<Figure size 2500x1000 with 0 Axes>

Most bookings have 1 or less special requests.
The most special requests is 5.

Categorical Variables¶

In [42]:
# Labeled barplot for type of meal plan
labeled_barplot(df, "type_of_meal_plan", perc=True, n=25)

Most guests are picking meal plan 1. 76.7% of all guest chose this plan.
The next biggest group is the guests that chose not to have a meal plan. They make up 14.1% of all bookings.

In [43]:
# Labeled barplot for room type reserved
labeled_barplot(df, "room_type_reserved", perc=True, n=25)

Most guests are reserving Room Type 1. 77.5% of all guests reserved this type of room.
Next popular room type is Room Type 4. 16.7% of all guests reserved this type of room.
Room Type 3 is the least popular room. Only 7 guests booked this type of room.

In [44]:
# Labeled barplot for market segment
labeled_barplot(df, "market_segment_type", perc=True, n=25)

The most popular segment of our guests is online which accounts for 64% of all bookings.
Next popular segment is offline which accounts for 29% of all bookings.
Corporate guests only account for 5.6% of all bookings.
Complimentary or avaiation guests only account for 1.4% of all bookings.

In [45]:
df2.groupby('market_segment_type').count().sort_values(by='booking_status', ascending=False)['booking_status']
Out[45]:
market_segment_type
Online           23214
Offline          10528
Corporate         2017
Complementary      391
Aviation           125
Name: booking_status, dtype: int64

2. Which market segment do most of the guest come from?

  1. Online 23,214 64%
  2. Offline 10,528 29%
  3. Corporate 2,017 5.6%
  4. Complementary 391 1.1%
  5. Aviation 125 0.3%
In [46]:
# Labeled barplot for booking_status
labeled_barplot(df, "booking_status", perc=True, n=25)

11,885 bookings 32.8% are canceled.
24,390 bookings 67.2% are not canceled.

Bivariate Analysis:¶

Correlation Check¶

In [47]:
heatmap_list = df2.select_dtypes(include=np.number).columns.tolist()
# dropping release_year as it is a temporal variable.

plt.figure(figsize=(15, 7))
sns.heatmap(
    df2[heatmap_list].corr(), annot=True, vmin=-1, vmax=1, fmt=".2f", cmap="hsv"
)
plt.show()
Out[47]:
<Figure size 1500x700 with 0 Axes>
Out[47]:
<Axes: >

no_of_previous_bookings_not_canceled and repeated_guest have a 0.54 correlation.
Obviously no_of_previous_bookings_not_cancelled is related to no_of_previous_cancellations. They have a 47% rate

Market Segment compared to Avg Price Per Room

3. Hotel rooms are dynamic and change according to demand & customer demographics.
What is difference in room prices in different market segments.

In [48]:
df2.groupby('market_segment_type').agg({'avg_price_per_room':'mean'}).sort_values(by='avg_price_per_room',ascending=False).reset_index()
Out[48]:
market_segment_type avg_price_per_room
0 Online 112.26
1 Aviation 100.70
2 Offline 91.63
3 Corporate 82.91
4 Complementary 3.14

Online has the highest average price at 112.26.
As expected Complimentarty is the lowest average price of 3.14. They average between 0 and 20 usually per booking.
Corporate rates are the next lowest at a average of 82.91.

In [49]:
df2['booking_status'].value_counts()
Out[49]:
booking_status
Not_Canceled    24390
Canceled        11885
Name: count, dtype: int64

4. What percent of bookings are canceled?

Out of all bookings, 11,885 bookings 32.8% are canceled.
24,390 bookings 67.2% are not canceled.

Repeated Guests compared to Booking Status

5. Repeating guest are the guest who stay in the hotel often and are important to brand equity.
What percent of repeating guests cancel?

In [50]:
df2.groupby('repeated_guest')['booking_status'].value_counts()
Out[50]:
repeated_guest  booking_status
0               Not_Canceled      23476
                Canceled          11869
1               Not_Canceled        914
                Canceled             16
Name: count, dtype: int64
In [51]:
stacked_barplot(df, "repeated_guest", "booking_status")
booking_status  Canceled  Not_Canceled    All
repeated_guest                               
All                11885         24390  36275
0                  11869         23476  35345
1                     16           914    930
------------------------------------------------------------------------------------------------------------------------
In [52]:
distribution_plot_wrt_target(df2, "repeated_guest", "booking_status")

Repeated guests cancel less than guests who are not repeat guests.
Out of 930 booking by repeat guests only 16 cancelled. That is only 1.7% of all their bookings.
Bookings made by non repeat guests cancel at a rate of 33.6%.

No of Special Requests copared to Booking Status

6. Many guests have special requirements when booking a room.
Does these affect booking cancellations?

In [53]:
df2.groupby('no_of_special_requests')['booking_status'].value_counts()
Out[53]:
no_of_special_requests  booking_status
0                       Not_Canceled      11232
                        Canceled           8545
1                       Not_Canceled       8670
                        Canceled           2703
2                       Not_Canceled       3727
                        Canceled            637
3                       Not_Canceled        675
4                       Not_Canceled         78
5                       Not_Canceled          8
Name: count, dtype: int64
In [54]:
stacked_barplot(df2, "no_of_special_requests", "booking_status")
booking_status          Canceled  Not_Canceled    All
no_of_special_requests                               
All                        11885         24390  36275
0                           8545         11232  19777
1                           2703          8670  11373
2                            637          3727   4364
3                              0           675    675
4                              0            78     78
5                              0             8      8
------------------------------------------------------------------------------------------------------------------------
In [55]:
distribution_plot_wrt_target(df2, "no_of_special_requests", "booking_status")

The more the special request the less chance the guest will cancel.
THe guests that had 3 or more special requests did not cancel in bookings. The guests that had no special requests canceled 43.2% of their bookings. The guesta that had 1 or 2 special requests canceled 21.2% of their bookings.

Room Type Reserved compared to Type of Meal Plan Type

In [56]:
stacked_barplot(df2, "room_type_reserved", "type_of_meal_plan")
type_of_meal_plan   Meal Plan 1  Meal Plan 2  Meal Plan 3  Not Selected    All
room_type_reserved                                                            
All                       27835         3305            5          5130  36275
Room_Type 7                 152            2            3             1    158
Room_Type 1               20157         2934            1          5038  28130
Room_Type 4                5748          273            1            35   6057
Room_Type 2                 653           16            0            23    692
Room_Type 3                   5            0            0             2      7
Room_Type 5                 242           14            0             9    265
Room_Type 6                 878           66            0            22    966
------------------------------------------------------------------------------------------------------------------------

Across all room types meal plan 1 is the most popular.
Second most popular is no meal plan (not selected).

Room Type Reserved compared to Market Segment Type

In [57]:
stacked_barplot(df2, "room_type_reserved", "market_segment_type")
market_segment_type  Aviation  Complementary  Corporate  Offline  Online  \
room_type_reserved                                                         
All                       125            391       2017    10528   23214   
Room_Type 4                65             52         99      613    5228   
Room_Type 1                60            247       1833     9747   16243   
Room_Type 2                 0             20          2       57     613   
Room_Type 3                 0              2          1        2       2   
Room_Type 5                 0             17         74       81      93   
Room_Type 6                 0             14          3       23     926   
Room_Type 7                 0             39          5        5     109   

market_segment_type    All  
room_type_reserved          
All                  36275  
Room_Type 4           6057  
Room_Type 1          28130  
Room_Type 2            692  
Room_Type 3              7  
Room_Type 5            265  
Room_Type 6            966  
Room_Type 7            158  
------------------------------------------------------------------------------------------------------------------------

Aviation guests always choose room type 4 and room type 1.
All rooms are mostly reserved by Online guests, with room type 4, room type 2 and room type 6 being the most popular for online guests.

Market Segement compared to repeated guest

In [226]:
stacked_barplot(df2, "repeated_guest", "market_segment_type")
market_segment_type  Aviation  Complementary  Corporate  Offline  Online  \
repeated_guest                                                             
All                       125            391       2017    10528   23214   
0                         109            265       1415    10438   23118   
1                          16            126        602       90      96   

market_segment_type    All  
repeated_guest              
All                  36275  
0                    35345  
1                      930  
------------------------------------------------------------------------------------------------------------------------

Lead Time compared to Booking Status

In [58]:
stacked_barplot(df2, "lead_time", "booking_status")
booking_status  Canceled  Not_Canceled    All
lead_time                                    
All                11885         24390  36275
188                  142            11    153
166                  122            19    141
245                  111             3    114
1                    110           968   1078
...                  ...           ...    ...
306                    0             2      2
336                    0            15     15
327                    0            15     15
318                    0             1      1
300                    0             1      1

[353 rows x 3 columns]
------------------------------------------------------------------------------------------------------------------------
In [59]:
distribution_plot_wrt_target(df2, "lead_time", "booking_status")

The more the lead time the bugger chance of cancellation.
There are fewer bookings with long lead times.

In [60]:
df2.groupby('booking_status').agg({'lead_time':'mean'}).sort_values(by='lead_time',ascending=False).reset_index()
Out[60]:
booking_status lead_time
0 Canceled 139.22
1 Not_Canceled 58.93

Average lead_time for cancellation is 139 days, whereas the average lead time for not_canceled bookings is ~59 days.

In [61]:
stacked_barplot(df2, "no_of_adults", "booking_status")
booking_status  Canceled  Not_Canceled    All
no_of_adults                                 
All                11885         24390  36275
2                   9119         16989  26108
1                   1856          5839   7695
3                    863          1454   2317
0                     44            95    139
4                      3            13     16
------------------------------------------------------------------------------------------------------------------------
In [62]:
distribution_plot_wrt_target(df2, "no_of_adults", "booking_status")

The amount of adults does not seem to effect whether a guest cancels or not.

In [63]:
stacked_barplot(df2, "no_of_children", "booking_status")
booking_status  Canceled  Not_Canceled    All
no_of_children                               
All                11885         24390  36275
0                  10882         22695  33577
1                    540          1078   1618
2                    457           601   1058
3                      5            14     19
9                      1             1      2
10                     0             1      1
------------------------------------------------------------------------------------------------------------------------
In [64]:
distribution_plot_wrt_target(df2, "no_of_children", "booking_status")

Except for the 10 children, the amount of children only minimally effects whether the guest cancels or not.

In [65]:
stacked_barplot(df2, "no_of_weekend_nights", "booking_status")
booking_status        Canceled  Not_Canceled    All
no_of_weekend_nights                               
All                      11885         24390  36275
0                         5093         11779  16872
1                         3432          6563   9995
2                         3157          5914   9071
4                           83            46    129
3                           74            79    153
5                           29             5     34
6                           16             4     20
7                            1             0      1
------------------------------------------------------------------------------------------------------------------------
In [66]:
distribution_plot_wrt_target(df2, "no_of_weekend_nights", "booking_status")

7 Weekend nights is completely canceled.
As the amount of weekend nights went down the cancellations went down.

In [67]:
stacked_barplot(df2, "no_of_week_nights", "booking_status")
booking_status     Canceled  Not_Canceled    All
no_of_week_nights                               
All                   11885         24390  36275
2                      3997          7447  11444
3                      2574          5265   7839
1                      2572          6916   9488
4                      1143          1847   2990
0                       679          1708   2387
5                       632           982   1614
6                        88           101    189
10                       53             9     62
7                        52            61    113
8                        32            30     62
9                        21            13     34
11                       14             3     17
15                        8             2     10
12                        7             2      9
13                        5             0      5
14                        4             3      7
16                        2             0      2
17                        2             1      3
------------------------------------------------------------------------------------------------------------------------
In [68]:
distribution_plot_wrt_target(df2, "no_of_week_nights", "booking_status")

16and 13 Week nights is completely canceled.
As the amount of week nights went down the cancellations went down.

In [69]:
stacked_barplot(df2, "required_car_parking_space", "booking_status")
booking_status              Canceled  Not_Canceled    All
required_car_parking_space                               
All                            11885         24390  36275
0                              11771         23380  35151
1                                114          1010   1124
------------------------------------------------------------------------------------------------------------------------
In [70]:
distribution_plot_wrt_target(df2, "required_car_parking_space", "booking_status")

Guests that required a parking space canceled less.
Guests who did not require a parking space canceled 33.5% of the time.
Guests who needed a parking space canceled 10.1% of the time.

In [71]:
stacked_barplot(df2, "arrival_year", "booking_status")
booking_status  Canceled  Not_Canceled    All
arrival_year                                 
All                11885         24390  36275
2018               10924         18837  29761
2017                 961          5553   6514
------------------------------------------------------------------------------------------------------------------------
In [72]:
distribution_plot_wrt_target(df2, "arrival_year", "booking_status")

2018 had 36.7% cancel.
2017 had 14.7% cancel. 82% of all guests booked rooms for 2018.

In [73]:
stacked_barplot(df2, "arrival_month", "booking_status")
booking_status  Canceled  Not_Canceled    All
arrival_month                                
All                11885         24390  36275
10                  1880          3437   5317
9                   1538          3073   4611
8                   1488          2325   3813
7                   1314          1606   2920
6                   1291          1912   3203
4                    995          1741   2736
5                    948          1650   2598
11                   875          2105   2980
3                    700          1658   2358
2                    430          1274   1704
12                   402          2619   3021
1                     24           990   1014
------------------------------------------------------------------------------------------------------------------------
In [74]:
distribution_plot_wrt_target(df2, "arrival_month", "booking_status")

Arrival month 10 - cancellations 35.36%
Arrival month 9 - cancellations 33.36%
Arrival month 8 - cancellations 39.02%
Arrival month 7 - cancellations 45.00%
Arrival month 6 - cancellations 40.31%
Arrival month 4 - cancellations 36.37%
Arrival month 5 - cancellations 36.49%
Arrival month 11 - cancellations 29.36%
Arrival month 3 - cancellations 29.69%
Arrival month 2 - cancellations 25.23%
Arrival month 12 - cancellations 13.31%
Arrival month 1 - cancellations 2.37%

In [75]:
stacked_barplot(df2, "arrival_date", "booking_status")
booking_status  Canceled  Not_Canceled    All
arrival_date                                 
All                11885         24390  36275
15                   538           735   1273
4                    474           853   1327
16                   473           833   1306
30                   465           751   1216
1                    465           668   1133
12                   460           744   1204
17                   448           897   1345
6                    444           829   1273
26                   425           721   1146
19                   413           914   1327
20                   413           868   1281
13                   408           950   1358
28                   405           724   1129
3                    403           695   1098
25                   395           751   1146
21                   376           782   1158
24                   372           731   1103
18                   366           894   1260
7                    364           746   1110
8                    356           842   1198
22                   351           672   1023
23                   341           649    990
29                   334           856   1190
11                   330           768   1098
5                    328           826   1154
14                   327           915   1242
10                   318           771   1089
27                   313           746   1059
2                    308          1023   1331
9                    294           836   1130
31                   178           400    578
------------------------------------------------------------------------------------------------------------------------
In [76]:
distribution_plot_wrt_target(df2, "arrival_date", "booking_status")

Top 5 cancelation days:
Day 15 - 42.26%
Day 1 - 41.04%
Day 30 - 38.24%
Day 12 - 38.21%
Day 26 - 37.09%

Lowest 5 cancelation days:
Day 2 - 23.14%
Day 9 - 26.02%
Day 14 - 26.33%
Day 29 - 28.07%
Day 5 - 28.42%

In [77]:
stacked_barplot(df2, "no_of_previous_cancellations", "booking_status")
booking_status                Canceled  Not_Canceled    All
no_of_previous_cancellations                               
All                              11885         24390  36275
0                                11869         24068  35937
1                                   11           187    198
13                                   4             0      4
3                                    1            42     43
2                                    0            46     46
4                                    0            10     10
5                                    0            11     11
6                                    0             1      1
11                                   0            25     25
------------------------------------------------------------------------------------------------------------------------
In [78]:
distribution_plot_wrt_target(df2, "no_of_previous_cancellations", "booking_status")

No of previous cancellations 11, 6, 5, 4 and 2 had 0 cancellations.
No of previous cancellations 13 had 100% cancelled.
No of previous cancellations 1 had 5.56% cancelled.
No of previous cancellations 0 had 33.03% cancelled.

In [79]:
stacked_barplot(df2, "no_of_previous_bookings_not_canceled", "booking_status")
booking_status                        Canceled  Not_Canceled    All
no_of_previous_bookings_not_canceled                               
All                                      11885         24390  36275
0                                        11878         23585  35463
1                                            4           224    228
12                                           1            11     12
4                                            1            64     65
6                                            1            35     36
2                                            0           112    112
44                                           0             2      2
43                                           0             1      1
42                                           0             1      1
41                                           0             1      1
40                                           0             1      1
38                                           0             1      1
39                                           0             1      1
46                                           0             1      1
37                                           0             1      1
36                                           0             1      1
35                                           0             1      1
45                                           0             1      1
48                                           0             2      2
47                                           0             1      1
33                                           0             1      1
49                                           0             1      1
50                                           0             1      1
51                                           0             1      1
52                                           0             1      1
53                                           0             1      1
54                                           0             1      1
55                                           0             1      1
56                                           0             1      1
57                                           0             1      1
58                                           0             1      1
34                                           0             1      1
31                                           0             2      2
32                                           0             2      2
3                                            0            80     80
5                                            0            60     60
7                                            0            24     24
8                                            0            23     23
9                                            0            19     19
10                                           0            19     19
11                                           0            15     15
13                                           0             7      7
14                                           0             9      9
15                                           0             8      8
16                                           0             7      7
17                                           0             6      6
18                                           0             6      6
19                                           0             6      6
20                                           0             6      6
21                                           0             6      6
22                                           0             6      6
23                                           0             3      3
24                                           0             3      3
25                                           0             3      3
26                                           0             2      2
27                                           0             3      3
28                                           0             2      2
29                                           0             2      2
30                                           0             2      2
------------------------------------------------------------------------------------------------------------------------
In [80]:
distribution_plot_wrt_target(df2, "no_of_previous_bookings_not_canceled", "booking_status")

All but 0, 1, 12, 4, and 6 had no cancellations.
0 previous bookings not canceled had 33.49% canceled.
1 previous bookings not canceled had 1.75% canceled.
12 previous bookings not canceled had 8.33% canceled.
4 previous bookings not canceled had 1.54% canceled.
6 previous bookings not canceled had 2.78% canceled.

In [81]:
df2.groupby('booking_status').agg({'avg_price_per_room':'mean'}).sort_values(by='avg_price_per_room',ascending=False).reset_index()
Out[81]:
booking_status avg_price_per_room
0 Canceled 110.59
1 Not_Canceled 99.93
In [82]:
distribution_plot_wrt_target(df2, "avg_price_per_room", "booking_status")

Does not look like there is much correlation between booking status and average price per room.
Difference in average price per room is a difference of ~10.

In [83]:
plt.figure(figsize=(30, 6))
ax = sns.countplot(x='market_segment_type', data=df2, hue='booking_status', edgecolor='purple')
plt.xlabel('Market Segment')
plt.ylabel('Guest Count')
plt.title('Cancellation Status by Market Segment')
plt.ylim(0, 22000)

# Group by market segment
grouped_df = df2.groupby('market_segment_type')['booking_status'].value_counts()

# Calculate total bookings per segment
total_counts = df2.groupby('market_segment_type')['booking_status'].count()

# Calculate percentage for each booking status
for segment, status in grouped_df.index:
    count = grouped_df.loc[segment, status]
    total = total_counts.loc[segment]
    percentage = (count / total) * 100
    print(f"Segment: {segment}, Status: {status}, Percentage: {percentage:.1f}%")

# Annotate the bars (count and percentage)
for p in ax.patches:
  cnt = p.get_height()
  prc = "{:.1f}%".format(100.0 * p.get_height() / (df2.shape[0] )) # percentage
  xx = p.get_x() + p.get_width() / 2
  yy = p.get_height()
  ax.annotate(f"{prc}", (xx, yy), ha="center", va="center", size=12, xytext=(0, 10), textcoords="offset points") # annotate percentage
  ax.annotate(cnt, (xx, yy + 1000), ha="center", va="center", size=12, xytext=(0, 10), textcoords="offset points")

# Show the plot
plt.show()
Out[83]:
<Figure size 3000x600 with 0 Axes>
Out[83]:
Text(0.5, 0, 'Market Segment')
Out[83]:
Text(0, 0.5, 'Guest Count')
Out[83]:
Text(0.5, 1.0, 'Cancellation Status by Market Segment')
Out[83]:
(0.0, 22000.0)
Segment: Aviation, Status: Not_Canceled, Percentage: 70.4%
Segment: Aviation, Status: Canceled, Percentage: 29.6%
Segment: Complementary, Status: Not_Canceled, Percentage: 100.0%
Segment: Corporate, Status: Not_Canceled, Percentage: 89.1%
Segment: Corporate, Status: Canceled, Percentage: 10.9%
Segment: Offline, Status: Not_Canceled, Percentage: 70.1%
Segment: Offline, Status: Canceled, Percentage: 29.9%
Segment: Online, Status: Not_Canceled, Percentage: 63.5%
Segment: Online, Status: Canceled, Percentage: 36.5%
Out[83]:
Text(0, 10, '20.3%')
Out[83]:
Text(0, 10, '7375.0')
Out[83]:
Text(0, 10, '40.6%')
Out[83]:
Text(0, 10, '14739.0')
Out[83]:
Text(0, 10, '5.0%')
Out[83]:
Text(0, 10, '1797.0')
Out[83]:
Text(0, 10, '0.2%')
Out[83]:
Text(0, 10, '88.0')
Out[83]:
Text(0, 10, '1.1%')
Out[83]:
Text(0, 10, '391.0')
Out[83]:
Text(0, 10, '8.7%')
Out[83]:
Text(0, 10, '3153.0')
Out[83]:
Text(0, 10, '23.4%')
Out[83]:
Text(0, 10, '8475.0')
Out[83]:
Text(0, 10, '0.6%')
Out[83]:
Text(0, 10, '220.0')
Out[83]:
Text(0, 10, '0.1%')
Out[83]:
Text(0, 10, '37.0')
Out[83]:
Text(0, 10, '0.0%')
Out[83]:
Text(0, 10, '0')
Out[83]:
Text(0, 10, '0.0%')
Out[83]:
Text(0, 10, '0')
In [84]:
distribution_plot_wrt_target(df2, "market_segment_type", "booking_status")

Online bookings account for 64% of all bookings. Of those 63% of the bookings are not canceled, whereas 37% are canceled.
Offline bookings account for 29% of all bookings. Of those 70% of the bookings are not canceled, whereas 30% are canceled.
Corporate bookings account for 5.6% of all bookings. Of those 89% of the bookings are not canceled, whereas 11% are canceled.
Complementary bookings account for 1.1% of all bookings. Of those 100% of the bookings are not canceled.
Aviation bookings account for 0.3% of all bookings. Of those 70% of the bookings are not canceled, whereas 30% are canceled.

Data Preprocessing¶

  • Missing value treatment (if needed)
  • Feature engineering (if needed)
  • Outlier detection and treatment (if needed)
  • Preparing data for modeling
  • Any other preprocessing steps (if needed)
In [ ]:
 

Does not appear that whether or not it is a weekend or a weekday makes a difference when it comes to cancellations.
As a result converting to total nights and dropping no_of_weekend_nights and no-of week_nights.

In [85]:
# Make a copy in case there is any issues
df3 = df.copy()
In [86]:
df3['total_nights'] = df3['no_of_weekend_nights'] + df3['no_of_week_nights']
df3.drop(labels='no_of_weekend_nights', axis=1, inplace=True)
df3.drop(labels='no_of_week_nights', axis=1, inplace=True)

df3.drop(labels='Booking_ID', axis=1, inplace=True)
In [87]:
df2['total_nights'] = df2['no_of_weekend_nights'] + df2['no_of_week_nights']
df2.drop(labels='no_of_weekend_nights', axis=1, inplace=True)
df2.drop(labels='no_of_week_nights', axis=1, inplace=True)
In [88]:
df2.head()
Out[88]:
Booking_ID no_of_adults no_of_children type_of_meal_plan required_car_parking_space room_type_reserved lead_time arrival_year arrival_month arrival_date market_segment_type repeated_guest no_of_previous_cancellations no_of_previous_bookings_not_canceled avg_price_per_room no_of_special_requests booking_status total_nights
0 INN00001 2 0 Meal Plan 1 0 Room_Type 1 224 2017 10 2 Offline 0 0 0 65.00 0 Not_Canceled 3
1 INN00002 2 0 Not Selected 0 Room_Type 1 5 2018 11 6 Online 0 0 0 106.68 1 Not_Canceled 5
2 INN00003 1 0 Meal Plan 1 0 Room_Type 1 1 2018 2 28 Online 0 0 0 60.00 0 Canceled 3
3 INN00004 2 0 Meal Plan 1 0 Room_Type 1 211 2018 5 20 Online 0 0 0 100.00 0 Canceled 2
4 INN00005 2 0 Not Selected 0 Room_Type 1 48 2018 4 11 Online 0 0 0 94.50 0 Canceled 2

Booking_Id is not needed to do any modeling. Will not help with any comparisons.

In [89]:
#drop the column *Booking_ID* from the dataframe
df2.drop(labels='Booking_ID', axis=1, inplace=True)
In [90]:
df2.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 36275 entries, 0 to 36274
Data columns (total 17 columns):
 #   Column                                Non-Null Count  Dtype  
---  ------                                --------------  -----  
 0   no_of_adults                          36275 non-null  int64  
 1   no_of_children                        36275 non-null  int64  
 2   type_of_meal_plan                     36275 non-null  object 
 3   required_car_parking_space            36275 non-null  int64  
 4   room_type_reserved                    36275 non-null  object 
 5   lead_time                             36275 non-null  int64  
 6   arrival_year                          36275 non-null  int64  
 7   arrival_month                         36275 non-null  int64  
 8   arrival_date                          36275 non-null  int64  
 9   market_segment_type                   36275 non-null  object 
 10  repeated_guest                        36275 non-null  int64  
 11  no_of_previous_cancellations          36275 non-null  int64  
 12  no_of_previous_bookings_not_canceled  36275 non-null  int64  
 13  avg_price_per_room                    36275 non-null  float64
 14  no_of_special_requests                36275 non-null  int64  
 15  booking_status                        36275 non-null  object 
 16  total_nights                          36275 non-null  int64  
dtypes: float64(1), int64(12), object(4)
memory usage: 4.7+ MB

EDA¶

  • It is a good idea to explore the data once again after manipulating it.
In [91]:
stacked_barplot(df2, "total_nights", "booking_status")
booking_status  Canceled  Not_Canceled    All
total_nights                                 
All                11885         24390  36275
3                   3586          6466  10052
2                   2899          5573   8472
4                   1941          3952   5893
1                   1466          5138   6604
5                    823          1766   2589
6                    465           566   1031
7                    383           590    973
8                     79           100    179
10                    58            51    109
9                     53            58    111
14                    27             5     32
15                    26             5     31
13                    15             3     18
12                    15             9     24
11                    15            24     39
20                     8             3     11
16                     5             1      6
19                     5             1      6
17                     4             1      5
18                     3             0      3
21                     3             1      4
22                     2             0      2
0                      2            76     78
23                     1             1      2
24                     1             0      1
------------------------------------------------------------------------------------------------------------------------
In [92]:
distribution_plot_wrt_target(df2, "total_nights", "booking_status")

As the amount of days increase the cancellation rate decreases.

Outlier Detection¶

In [93]:
# functions to treat outliers by flooring and capping


def treat_outliers(df2, col):
    """
    Treats outliers in a variable

    df2: dataframe
    col: dataframe column
    """
    Q1 = df2[col].quantile(0.25)  # 25th quantile
    Q3 = df2[col].quantile(0.75)  # 75th quantile
    IQR = Q3 - Q1
    Lower_Whisker = Q1 - 1.5 * IQR
    Upper_Whisker = Q3 + 1.5 * IQR

    # all the values smaller than Lower_Whisker will be assigned the value of Lower_Whisker
    # all the values greater than Upper_Whisker will be assigned the value of Upper_Whisker
    df2[col] = np.clip(df2[col], Lower_Whisker, Upper_Whisker)

    return df2


def treat_outliers_all(df2, col_list):
    """
    Treat outliers in a list of variables

    df2: dataframe
    col_list: list of dataframe columns
    """
    for c in col_list:
        df2 = treat_outliers(df2, c)

    return df2
In [94]:
numeric_columns = df2.select_dtypes(include=np.number).columns.to_list()
plt.figure(figsize=(20, 30))
for i, variable in enumerate(numeric_columns):
    plt.subplot(5, 4, i + 1)
    plt.boxplot(df2[variable], whis=1.5)
    plt.tight_layout()
    plt.title(variable)

# Update color properties
boxprops = dict(color="red")  # Change the box color to red
capprops = dict(color="blue")  # Change the cap color to blue
whiskerprops = dict(color="purple")  # Change the whisker color to purple
flierprops = dict(markerfacecolor="teal")  # Change the flier marker color to teal
medianprops = dict(color="violet")  # Change the median line color to violet

# Create the final combined graph
fig, ax = plt.subplots()
ax.set_title('Numerical Column Boxplots')
plt.boxplot(df2[variable], whis=1.5, boxprops=boxprops, capprops=capprops,
            whiskerprops=whiskerprops, flierprops=flierprops, medianprops=medianprops)

# Toggle visibility of the entire figure
def toggle_plot(event):
    plt.gcf().set_visible(not plt.gcf().get_visible())
    plt.draw()

cid = plt.gcf().canvas.mpl_connect("key_press_event", toggle_plot)
plt.show()
Out[94]:
<Figure size 2000x3000 with 0 Axes>
Out[94]:
<Axes: >
Out[94]:
{'whiskers': [<matplotlib.lines.Line2D at 0x7cf388765570>,
  <matplotlib.lines.Line2D at 0x7cf388764a30>],
 'caps': [<matplotlib.lines.Line2D at 0x7cf388765cc0>,
  <matplotlib.lines.Line2D at 0x7cf388764be0>],
 'boxes': [<matplotlib.lines.Line2D at 0x7cf388766440>],
 'medians': [<matplotlib.lines.Line2D at 0x7cf388765d50>],
 'fliers': [<matplotlib.lines.Line2D at 0x7cf388766ad0>],
 'means': []}
Out[94]:
Text(0.5, 1.0, 'no_of_adults')
Out[94]:
<Axes: >
Out[94]:
{'whiskers': [<matplotlib.lines.Line2D at 0x7cf3881699f0>,
  <matplotlib.lines.Line2D at 0x7cf388168a30>],
 'caps': [<matplotlib.lines.Line2D at 0x7cf388168be0>,
  <matplotlib.lines.Line2D at 0x7cf38816a830>],
 'boxes': [<matplotlib.lines.Line2D at 0x7cf38816b190>],
 'medians': [<matplotlib.lines.Line2D at 0x7cf388169780>],
 'fliers': [<matplotlib.lines.Line2D at 0x7cf38816b490>],
 'means': []}
Out[94]:
Text(0.5, 1.0, 'no_of_children')
Out[94]:
<Axes: >
Out[94]:
{'whiskers': [<matplotlib.lines.Line2D at 0x7cf38243af20>,
  <matplotlib.lines.Line2D at 0x7cf38243afe0>],
 'caps': [<matplotlib.lines.Line2D at 0x7cf382439cc0>,
  <matplotlib.lines.Line2D at 0x7cf38243bd60>],
 'boxes': [<matplotlib.lines.Line2D at 0x7cf382439f90>],
 'medians': [<matplotlib.lines.Line2D at 0x7cf382439d50>],
 'fliers': [<matplotlib.lines.Line2D at 0x7cf38243aef0>],
 'means': []}
Out[94]:
Text(0.5, 1.0, 'required_car_parking_space')
Out[94]:
<Axes: >
Out[94]:
{'whiskers': [<matplotlib.lines.Line2D at 0x7cf3814ddcc0>,
  <matplotlib.lines.Line2D at 0x7cf3814dec80>],
 'caps': [<matplotlib.lines.Line2D at 0x7cf3814dfb20>,
  <matplotlib.lines.Line2D at 0x7cf3814deda0>],
 'boxes': [<matplotlib.lines.Line2D at 0x7cf3881bdf90>],
 'medians': [<matplotlib.lines.Line2D at 0x7cf3814de800>],
 'fliers': [<matplotlib.lines.Line2D at 0x7cf3814dee60>],
 'means': []}
Out[94]:
Text(0.5, 1.0, 'lead_time')
Out[94]:
<Axes: >
Out[94]:
{'whiskers': [<matplotlib.lines.Line2D at 0x7cf38887cca0>,
  <matplotlib.lines.Line2D at 0x7cf38887e740>],
 'caps': [<matplotlib.lines.Line2D at 0x7cf38887c340>,
  <matplotlib.lines.Line2D at 0x7cf3822e60e0>],
 'boxes': [<matplotlib.lines.Line2D at 0x7cf38887c760>],
 'medians': [<matplotlib.lines.Line2D at 0x7cf3822e6350>],
 'fliers': [<matplotlib.lines.Line2D at 0x7cf3822e72e0>],
 'means': []}
Out[94]:
Text(0.5, 1.0, 'arrival_year')
Out[94]:
<Axes: >
Out[94]:
{'whiskers': [<matplotlib.lines.Line2D at 0x7cf388a20970>,
  <matplotlib.lines.Line2D at 0x7cf388a23040>],
 'caps': [<matplotlib.lines.Line2D at 0x7cf388a21f60>,
  <matplotlib.lines.Line2D at 0x7cf388a210f0>],
 'boxes': [<matplotlib.lines.Line2D at 0x7cf3884acca0>],
 'medians': [<matplotlib.lines.Line2D at 0x7cf388a22c50>],
 'fliers': [<matplotlib.lines.Line2D at 0x7cf388a206a0>],
 'means': []}
Out[94]:
Text(0.5, 1.0, 'arrival_month')
Out[94]:
<Axes: >
Out[94]:
{'whiskers': [<matplotlib.lines.Line2D at 0x7cf3884249d0>,
  <matplotlib.lines.Line2D at 0x7cf388425210>],
 'caps': [<matplotlib.lines.Line2D at 0x7cf388427f40>,
  <matplotlib.lines.Line2D at 0x7cf388425030>],
 'boxes': [<matplotlib.lines.Line2D at 0x7cf388425060>],
 'medians': [<matplotlib.lines.Line2D at 0x7cf388427c70>],
 'fliers': [<matplotlib.lines.Line2D at 0x7cf3884264d0>],
 'means': []}
Out[94]:
Text(0.5, 1.0, 'arrival_date')
Out[94]:
<Axes: >
Out[94]:
{'whiskers': [<matplotlib.lines.Line2D at 0x7cf382315f30>,
  <matplotlib.lines.Line2D at 0x7cf382314610>],
 'caps': [<matplotlib.lines.Line2D at 0x7cf3823168c0>,
  <matplotlib.lines.Line2D at 0x7cf3823170a0>],
 'boxes': [<matplotlib.lines.Line2D at 0x7cf3823168f0>],
 'medians': [<matplotlib.lines.Line2D at 0x7cf382317640>],
 'fliers': [<matplotlib.lines.Line2D at 0x7cf3823148e0>],
 'means': []}
Out[94]:
Text(0.5, 1.0, 'repeated_guest')
Out[94]:
<Axes: >
Out[94]:
{'whiskers': [<matplotlib.lines.Line2D at 0x7cf3813fe8c0>,
  <matplotlib.lines.Line2D at 0x7cf3813ff9d0>],
 'caps': [<matplotlib.lines.Line2D at 0x7cf3813ffa30>,
  <matplotlib.lines.Line2D at 0x7cf3813ffd60>],
 'boxes': [<matplotlib.lines.Line2D at 0x7cf3813fe6b0>],
 'medians': [<matplotlib.lines.Line2D at 0x7cf3813ffe20>],
 'fliers': [<matplotlib.lines.Line2D at 0x7cf3813ff550>],
 'means': []}
Out[94]:
Text(0.5, 1.0, 'no_of_previous_cancellations')
Out[94]:
<Axes: >
Out[94]:
{'whiskers': [<matplotlib.lines.Line2D at 0x7cf38873e350>,
  <matplotlib.lines.Line2D at 0x7cf38873dbd0>],
 'caps': [<matplotlib.lines.Line2D at 0x7cf38873d5d0>,
  <matplotlib.lines.Line2D at 0x7cf38873cc70>],
 'boxes': [<matplotlib.lines.Line2D at 0x7cf38873f5e0>],
 'medians': [<matplotlib.lines.Line2D at 0x7cf38873e740>],
 'fliers': [<matplotlib.lines.Line2D at 0x7cf38873c250>],
 'means': []}
Out[94]:
Text(0.5, 1.0, 'no_of_previous_bookings_not_canceled')
Out[94]:
<Axes: >
Out[94]:
{'whiskers': [<matplotlib.lines.Line2D at 0x7cf381f75c00>,
  <matplotlib.lines.Line2D at 0x7cf38218d390>],
 'caps': [<matplotlib.lines.Line2D at 0x7cf38218dba0>,
  <matplotlib.lines.Line2D at 0x7cf38218e050>],
 'boxes': [<matplotlib.lines.Line2D at 0x7cf381f75660>],
 'medians': [<matplotlib.lines.Line2D at 0x7cf38218ca30>],
 'fliers': [<matplotlib.lines.Line2D at 0x7cf38218fc10>],
 'means': []}
Out[94]:
Text(0.5, 1.0, 'avg_price_per_room')
Out[94]:
<Axes: >
Out[94]:
{'whiskers': [<matplotlib.lines.Line2D at 0x7cf388a71cf0>,
  <matplotlib.lines.Line2D at 0x7cf388a71a20>],
 'caps': [<matplotlib.lines.Line2D at 0x7cf388a70df0>,
  <matplotlib.lines.Line2D at 0x7cf388a72ef0>],
 'boxes': [<matplotlib.lines.Line2D at 0x7cf388a732e0>],
 'medians': [<matplotlib.lines.Line2D at 0x7cf388a717b0>],
 'fliers': [<matplotlib.lines.Line2D at 0x7cf388a70f70>],
 'means': []}
Out[94]:
Text(0.5, 1.0, 'no_of_special_requests')
Out[94]:
<Axes: >
Out[94]:
{'whiskers': [<matplotlib.lines.Line2D at 0x7cf388860400>,
  <matplotlib.lines.Line2D at 0x7cf388862bf0>],
 'caps': [<matplotlib.lines.Line2D at 0x7cf388861150>,
  <matplotlib.lines.Line2D at 0x7cf388860640>],
 'boxes': [<matplotlib.lines.Line2D at 0x7cf388860790>],
 'medians': [<matplotlib.lines.Line2D at 0x7cf388862e00>],
 'fliers': [<matplotlib.lines.Line2D at 0x7cf3818250f0>],
 'means': []}
Out[94]:
Text(0.5, 1.0, 'total_nights')
Out[94]:
Text(0.5, 1.0, 'Numerical Column Boxplots')
Out[94]:
{'whiskers': [<matplotlib.lines.Line2D at 0x7cf3814441f0>,
  <matplotlib.lines.Line2D at 0x7cf3814471f0>],
 'caps': [<matplotlib.lines.Line2D at 0x7cf381445450>,
  <matplotlib.lines.Line2D at 0x7cf381447e50>],
 'boxes': [<matplotlib.lines.Line2D at 0x7cf381446470>],
 'medians': [<matplotlib.lines.Line2D at 0x7cf381444910>],
 'fliers': [<matplotlib.lines.Line2D at 0x7cf381447ca0>],
 'means': []}
In [95]:
# Checking the distrinbution of all numeric columns using histplot.

plt.figure(figsize=(15, 45))

for i in range(len(numeric_columns)):
    plt.subplot(12, 3, i + 1)
    plt.hist(df2[numeric_columns[i]], bins=50,color = "teal")
    plt.tight_layout()
    plt.title(numeric_columns[i], fontsize=25)

plt.show(),
Out[95]:
<Figure size 1500x4500 with 0 Axes>
Out[95]:
<Axes: >
Out[95]:
(array([1.3900e+02, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 7.6950e+03, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        2.6108e+04, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 2.3170e+03, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.6000e+01]),
 array([0.  , 0.08, 0.16, 0.24, 0.32, 0.4 , 0.48, 0.56, 0.64, 0.72, 0.8 ,
        0.88, 0.96, 1.04, 1.12, 1.2 , 1.28, 1.36, 1.44, 1.52, 1.6 , 1.68,
        1.76, 1.84, 1.92, 2.  , 2.08, 2.16, 2.24, 2.32, 2.4 , 2.48, 2.56,
        2.64, 2.72, 2.8 , 2.88, 2.96, 3.04, 3.12, 3.2 , 3.28, 3.36, 3.44,
        3.52, 3.6 , 3.68, 3.76, 3.84, 3.92, 4.  ]),
 <BarContainer object of 50 artists>)
Out[95]:
Text(0.5, 1.0, 'no_of_adults')
Out[95]:
<Axes: >
Out[95]:
(array([3.3577e+04, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        1.6180e+03, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        1.0580e+03, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        1.9000e+01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        2.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0000e+00]),
 array([ 0. ,  0.2,  0.4,  0.6,  0.8,  1. ,  1.2,  1.4,  1.6,  1.8,  2. ,
         2.2,  2.4,  2.6,  2.8,  3. ,  3.2,  3.4,  3.6,  3.8,  4. ,  4.2,
         4.4,  4.6,  4.8,  5. ,  5.2,  5.4,  5.6,  5.8,  6. ,  6.2,  6.4,
         6.6,  6.8,  7. ,  7.2,  7.4,  7.6,  7.8,  8. ,  8.2,  8.4,  8.6,
         8.8,  9. ,  9.2,  9.4,  9.6,  9.8, 10. ]),
 <BarContainer object of 50 artists>)
Out[95]:
Text(0.5, 1.0, 'no_of_children')
Out[95]:
<Axes: >
Out[95]:
(array([35151.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,  1124.]),
 array([0.  , 0.02, 0.04, 0.06, 0.08, 0.1 , 0.12, 0.14, 0.16, 0.18, 0.2 ,
        0.22, 0.24, 0.26, 0.28, 0.3 , 0.32, 0.34, 0.36, 0.38, 0.4 , 0.42,
        0.44, 0.46, 0.48, 0.5 , 0.52, 0.54, 0.56, 0.58, 0.6 , 0.62, 0.64,
        0.66, 0.68, 0.7 , 0.72, 0.74, 0.76, 0.78, 0.8 , 0.82, 0.84, 0.86,
        0.88, 0.9 , 0.92, 0.94, 0.96, 0.98, 1.  ]),
 <BarContainer object of 50 artists>)
Out[95]:
Text(0.5, 1.0, 'required_car_parking_space')
Out[95]:
<Axes: >
Out[95]:
(array([6.237e+03, 2.989e+03, 2.198e+03, 2.253e+03, 2.126e+03, 1.505e+03,
        1.715e+03, 1.303e+03, 1.427e+03, 1.263e+03, 9.910e+02, 1.240e+03,
        9.040e+02, 9.360e+02, 5.840e+02, 6.470e+02, 6.760e+02, 6.080e+02,
        7.250e+02, 4.240e+02, 5.840e+02, 5.830e+02, 2.970e+02, 4.030e+02,
        3.860e+02, 2.180e+02, 1.730e+02, 2.870e+02, 2.370e+02, 2.940e+02,
        3.050e+02, 3.320e+02, 2.350e+02, 1.560e+02, 2.860e+02, 1.290e+02,
        1.620e+02, 5.700e+01, 2.500e+01, 1.100e+02, 2.200e+01, 1.000e+00,
        6.900e+01, 7.100e+01, 0.000e+00, 0.000e+00, 0.000e+00, 6.000e+01,
        2.000e+01, 2.200e+01]),
 array([  0.  ,   8.86,  17.72,  26.58,  35.44,  44.3 ,  53.16,  62.02,
         70.88,  79.74,  88.6 ,  97.46, 106.32, 115.18, 124.04, 132.9 ,
        141.76, 150.62, 159.48, 168.34, 177.2 , 186.06, 194.92, 203.78,
        212.64, 221.5 , 230.36, 239.22, 248.08, 256.94, 265.8 , 274.66,
        283.52, 292.38, 301.24, 310.1 , 318.96, 327.82, 336.68, 345.54,
        354.4 , 363.26, 372.12, 380.98, 389.84, 398.7 , 407.56, 416.42,
        425.28, 434.14, 443.  ]),
 <BarContainer object of 50 artists>)
Out[95]:
Text(0.5, 1.0, 'lead_time')
Out[95]:
<Axes: >
Out[95]:
(array([ 6514.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0., 29761.]),
 array([2017.  , 2017.02, 2017.04, 2017.06, 2017.08, 2017.1 , 2017.12,
        2017.14, 2017.16, 2017.18, 2017.2 , 2017.22, 2017.24, 2017.26,
        2017.28, 2017.3 , 2017.32, 2017.34, 2017.36, 2017.38, 2017.4 ,
        2017.42, 2017.44, 2017.46, 2017.48, 2017.5 , 2017.52, 2017.54,
        2017.56, 2017.58, 2017.6 , 2017.62, 2017.64, 2017.66, 2017.68,
        2017.7 , 2017.72, 2017.74, 2017.76, 2017.78, 2017.8 , 2017.82,
        2017.84, 2017.86, 2017.88, 2017.9 , 2017.92, 2017.94, 2017.96,
        2017.98, 2018.  ]),
 <BarContainer object of 50 artists>)
Out[95]:
Text(0.5, 1.0, 'arrival_year')
Out[95]:
<Axes: >
Out[95]:
(array([1014.,    0.,    0.,    0., 1704.,    0.,    0.,    0.,    0.,
        2358.,    0.,    0.,    0., 2736.,    0.,    0.,    0.,    0.,
        2598.,    0.,    0.,    0., 3203.,    0.,    0.,    0.,    0.,
        2920.,    0.,    0.,    0., 3813.,    0.,    0.,    0.,    0.,
        4611.,    0.,    0.,    0., 5317.,    0.,    0.,    0.,    0.,
        2980.,    0.,    0.,    0., 3021.]),
 array([ 1.  ,  1.22,  1.44,  1.66,  1.88,  2.1 ,  2.32,  2.54,  2.76,
         2.98,  3.2 ,  3.42,  3.64,  3.86,  4.08,  4.3 ,  4.52,  4.74,
         4.96,  5.18,  5.4 ,  5.62,  5.84,  6.06,  6.28,  6.5 ,  6.72,
         6.94,  7.16,  7.38,  7.6 ,  7.82,  8.04,  8.26,  8.48,  8.7 ,
         8.92,  9.14,  9.36,  9.58,  9.8 , 10.02, 10.24, 10.46, 10.68,
        10.9 , 11.12, 11.34, 11.56, 11.78, 12.  ]),
 <BarContainer object of 50 artists>)
Out[95]:
Text(0.5, 1.0, 'arrival_month')
Out[95]:
<Axes: >
Out[95]:
(array([1133., 1331.,    0., 1098.,    0., 1327., 1154.,    0., 1273.,
           0., 1110., 1198.,    0., 1130.,    0., 1089., 1098.,    0.,
        1204.,    0., 1358., 1242.,    0., 1273.,    0., 1306., 1345.,
           0., 1260.,    0., 1327., 1281.,    0., 1158.,    0., 1023.,
         990.,    0., 1103.,    0., 1146., 1146.,    0., 1059.,    0.,
        1129., 1190.,    0., 1216.,  578.]),
 array([ 1. ,  1.6,  2.2,  2.8,  3.4,  4. ,  4.6,  5.2,  5.8,  6.4,  7. ,
         7.6,  8.2,  8.8,  9.4, 10. , 10.6, 11.2, 11.8, 12.4, 13. , 13.6,
        14.2, 14.8, 15.4, 16. , 16.6, 17.2, 17.8, 18.4, 19. , 19.6, 20.2,
        20.8, 21.4, 22. , 22.6, 23.2, 23.8, 24.4, 25. , 25.6, 26.2, 26.8,
        27.4, 28. , 28.6, 29.2, 29.8, 30.4, 31. ]),
 <BarContainer object of 50 artists>)
Out[95]:
Text(0.5, 1.0, 'arrival_date')
Out[95]:
<Axes: >
Out[95]:
(array([35345.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,     0.,     0.,     0.,     0.,     0.,     0.,     0.,
            0.,   930.]),
 array([0.  , 0.02, 0.04, 0.06, 0.08, 0.1 , 0.12, 0.14, 0.16, 0.18, 0.2 ,
        0.22, 0.24, 0.26, 0.28, 0.3 , 0.32, 0.34, 0.36, 0.38, 0.4 , 0.42,
        0.44, 0.46, 0.48, 0.5 , 0.52, 0.54, 0.56, 0.58, 0.6 , 0.62, 0.64,
        0.66, 0.68, 0.7 , 0.72, 0.74, 0.76, 0.78, 0.8 , 0.82, 0.84, 0.86,
        0.88, 0.9 , 0.92, 0.94, 0.96, 0.98, 1.  ]),
 <BarContainer object of 50 artists>)
Out[95]:
Text(0.5, 1.0, 'repeated_guest')
Out[95]:
<Axes: >
Out[95]:
(array([3.5937e+04, 0.0000e+00, 0.0000e+00, 1.9800e+02, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 4.6000e+01, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 4.3000e+01, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        1.0000e+01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.1000e+01,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 2.5000e+01, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 4.0000e+00]),
 array([ 0.  ,  0.26,  0.52,  0.78,  1.04,  1.3 ,  1.56,  1.82,  2.08,
         2.34,  2.6 ,  2.86,  3.12,  3.38,  3.64,  3.9 ,  4.16,  4.42,
         4.68,  4.94,  5.2 ,  5.46,  5.72,  5.98,  6.24,  6.5 ,  6.76,
         7.02,  7.28,  7.54,  7.8 ,  8.06,  8.32,  8.58,  8.84,  9.1 ,
         9.36,  9.62,  9.88, 10.14, 10.4 , 10.66, 10.92, 11.18, 11.44,
        11.7 , 11.96, 12.22, 12.48, 12.74, 13.  ]),
 <BarContainer object of 50 artists>)
Out[95]:
Text(0.5, 1.0, 'no_of_previous_cancellations')
Out[95]:
<Axes: >
Out[95]:
(array([3.5691e+04, 1.1200e+02, 8.0000e+01, 6.5000e+01, 6.0000e+01,
        3.6000e+01, 4.7000e+01, 1.9000e+01, 1.9000e+01, 1.5000e+01,
        1.2000e+01, 7.0000e+00, 1.7000e+01, 7.0000e+00, 6.0000e+00,
        6.0000e+00, 6.0000e+00, 6.0000e+00, 1.2000e+01, 3.0000e+00,
        3.0000e+00, 3.0000e+00, 2.0000e+00, 3.0000e+00, 2.0000e+00,
        4.0000e+00, 2.0000e+00, 2.0000e+00, 1.0000e+00, 1.0000e+00,
        1.0000e+00, 2.0000e+00, 1.0000e+00, 1.0000e+00, 1.0000e+00,
        1.0000e+00, 1.0000e+00, 3.0000e+00, 1.0000e+00, 1.0000e+00,
        1.0000e+00, 2.0000e+00, 1.0000e+00, 2.0000e+00, 1.0000e+00,
        1.0000e+00, 1.0000e+00, 1.0000e+00, 1.0000e+00, 2.0000e+00]),
 array([ 0.  ,  1.16,  2.32,  3.48,  4.64,  5.8 ,  6.96,  8.12,  9.28,
        10.44, 11.6 , 12.76, 13.92, 15.08, 16.24, 17.4 , 18.56, 19.72,
        20.88, 22.04, 23.2 , 24.36, 25.52, 26.68, 27.84, 29.  , 30.16,
        31.32, 32.48, 33.64, 34.8 , 35.96, 37.12, 38.28, 39.44, 40.6 ,
        41.76, 42.92, 44.08, 45.24, 46.4 , 47.56, 48.72, 49.88, 51.04,
        52.2 , 53.36, 54.52, 55.68, 56.84, 58.  ]),
 <BarContainer object of 50 artists>)
Out[95]:
Text(0.5, 1.0, 'no_of_previous_bookings_not_canceled')
Out[95]:
<Axes: >
Out[95]:
(array([5.980e+02, 2.900e+01, 1.400e+01, 9.700e+01, 3.430e+02, 1.234e+03,
        4.538e+03, 4.626e+03, 5.922e+03, 4.325e+03, 4.538e+03, 2.968e+03,
        2.485e+03, 1.596e+03, 9.830e+02, 6.270e+02, 3.880e+02, 3.070e+02,
        2.270e+02, 1.480e+02, 1.190e+02, 6.600e+01, 2.600e+01, 2.400e+01,
        1.600e+01, 7.000e+00, 6.000e+00, 9.000e+00, 2.000e+00, 2.000e+00,
        1.000e+00, 0.000e+00, 1.000e+00, 1.000e+00, 1.000e+00, 0.000e+00,
        0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00,
        0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00, 0.000e+00,
        0.000e+00, 1.000e+00]),
 array([  0. ,  10.8,  21.6,  32.4,  43.2,  54. ,  64.8,  75.6,  86.4,
         97.2, 108. , 118.8, 129.6, 140.4, 151.2, 162. , 172.8, 183.6,
        194.4, 205.2, 216. , 226.8, 237.6, 248.4, 259.2, 270. , 280.8,
        291.6, 302.4, 313.2, 324. , 334.8, 345.6, 356.4, 367.2, 378. ,
        388.8, 399.6, 410.4, 421.2, 432. , 442.8, 453.6, 464.4, 475.2,
        486. , 496.8, 507.6, 518.4, 529.2, 540. ]),
 <BarContainer object of 50 artists>)
Out[95]:
Text(0.5, 1.0, 'avg_price_per_room')
Out[95]:
<Axes: >
Out[95]:
(array([1.9777e+04, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        1.1373e+04, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        4.3640e+03, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        6.7500e+02, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        7.8000e+01, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 8.0000e+00]),
 array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. , 1.1, 1.2,
        1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2. , 2.1, 2.2, 2.3, 2.4, 2.5,
        2.6, 2.7, 2.8, 2.9, 3. , 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8,
        3.9, 4. , 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5. ]),
 <BarContainer object of 50 artists>)
Out[95]:
Text(0.5, 1.0, 'no_of_special_requests')
Out[95]:
<Axes: >
Out[95]:
(array([7.8000e+01, 0.0000e+00, 6.6040e+03, 0.0000e+00, 8.4720e+03,
        0.0000e+00, 1.0052e+04, 0.0000e+00, 5.8930e+03, 0.0000e+00,
        2.5890e+03, 0.0000e+00, 1.0310e+03, 0.0000e+00, 9.7300e+02,
        0.0000e+00, 1.7900e+02, 0.0000e+00, 1.1100e+02, 0.0000e+00,
        1.0900e+02, 0.0000e+00, 3.9000e+01, 0.0000e+00, 0.0000e+00,
        2.4000e+01, 0.0000e+00, 1.8000e+01, 0.0000e+00, 3.2000e+01,
        0.0000e+00, 3.1000e+01, 0.0000e+00, 6.0000e+00, 0.0000e+00,
        5.0000e+00, 0.0000e+00, 3.0000e+00, 0.0000e+00, 6.0000e+00,
        0.0000e+00, 1.1000e+01, 0.0000e+00, 4.0000e+00, 0.0000e+00,
        2.0000e+00, 0.0000e+00, 2.0000e+00, 0.0000e+00, 1.0000e+00]),
 array([ 0.  ,  0.48,  0.96,  1.44,  1.92,  2.4 ,  2.88,  3.36,  3.84,
         4.32,  4.8 ,  5.28,  5.76,  6.24,  6.72,  7.2 ,  7.68,  8.16,
         8.64,  9.12,  9.6 , 10.08, 10.56, 11.04, 11.52, 12.  , 12.48,
        12.96, 13.44, 13.92, 14.4 , 14.88, 15.36, 15.84, 16.32, 16.8 ,
        17.28, 17.76, 18.24, 18.72, 19.2 , 19.68, 20.16, 20.64, 21.12,
        21.6 , 22.08, 22.56, 23.04, 23.52, 24.  ]),
 <BarContainer object of 50 artists>)
Out[95]:
Text(0.5, 1.0, 'total_nights')
Out[95]:
(None,)
In [96]:
#calculate interquartile range for average room price, 120 (75%), 80.30 (25%)
IQR = (120-80.30)
In [97]:
#create dataframes of rooms sold for no price (free), rooms sold for a low outlier average room price, and rooms sold for a high outlier average room price
df2_0 = df2[df2.avg_price_per_room == 0]
df2_low = df2[df2.avg_price_per_room < 99.45-1.5*IQR]
df2_high = df2[df2.avg_price_per_room > 99.45+1.5*IQR]
In [98]:
#shows the room price of zero
for colname in df2_0.dtypes[df2.dtypes == 'category'].index:
    print(df2_0[colname].value_counts(dropna=False))
    print(" ")
In [99]:
#shows the room price that is a low outlier
for colname in df2_low.dtypes[df2.dtypes == 'category'].index:
    print(df2_low[colname].value_counts(dropna=False))
    print(" ")
In [100]:
#shows the room price that is a low outlier
for colname in df2_high.dtypes[df2.dtypes == 'category'].index:
    print(df2_high[colname].value_counts(dropna=False))
    print(" ")
In [101]:
# Labeled barplot for type of meal plan
labeled_barplot(df2_high, "type_of_meal_plan", perc=True, n=10)
print()
labeled_barplot(df2_high, "room_type_reserved", perc=True, n=10)
print()
labeled_barplot(df2_high, "market_segment_type", perc=True, n=10)
print()
labeled_barplot(df2_high, "booking_status", perc=True, n=10)


There are 2301 high outliers.
82% of the high outliers chose meal plan 1.
35.2% of high outliers chose room type 4.
Most popular room types are room type 4, room type 6 and room type 1. 91.7% of the high outliers reserved online.
63.2% of the high outliers were not canceled, which means 36.8% of the bookings were canceled.
545 rooms were sold at no cost to the guests. Only 1% of those bookings were canceled.
686 rooms were sold at a low outlier amount to the guests. Only 3% of those bookings were canceled.

In [102]:
# compute adjusted R-squared
def adj_r2_score(predictors, targets, predictions):
    r2 = r2_score(targets, predictions)
    n = predictors.shape[0]
    k = predictors.shape[1]
    return 1 - ((1 - r2) * (n - 1) / (n - k - 1))


# compute MAPE
def mape_score(targets, predictions):
    return np.mean(np.abs(targets - predictions) / targets) * 100


# compute multiple metrics to check performance of a regression model
def model_performance_regression(model, predictors, target):
    """
    Function to compute different metrics to check regression model performance

    model: regressor
    predictors: independent variables
    target: dependent variable
    """

    # predicting using the independent variables
    pred = model.predict(predictors)

    r2 = r2_score(target, pred)  # to compute R-squared
    adjr2 = adj_r2_score(predictors, target, pred)  # to compute adjusted R-squared
    rmse = np.sqrt(mean_squared_error(target, pred))  # to compute RMSE
    mae = mean_absolute_error(target, pred)  # to compute MAE
    mape = mape_score(target, pred)  # to compute MAPE

    # creating a dataframe of metrics
    df_perf = pd.DataFrame(
        {
            "RMSE": rmse,
            "MAE": mae,
            "R-squared": r2,
            "Adj. R-squared": adjr2,
            "MAPE": mape,
              },
        index=[0],
    )

    return df_perf

Linear Regression models¶

Encoding Not Canceled as 0 and Canceled as 1, The hotel wants to be able to predict customers that might cancel their booking

In [103]:
df2["booking_status"] = df2["booking_status"].apply(lambda x: 1 if x == "Canceled" else 0)
In [104]:
df2["booking_status"].value_counts()
Out[104]:
booking_status
0    24390
1    11885
Name: count, dtype: int64
In [105]:
df2.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 36275 entries, 0 to 36274
Data columns (total 17 columns):
 #   Column                                Non-Null Count  Dtype  
---  ------                                --------------  -----  
 0   no_of_adults                          36275 non-null  int64  
 1   no_of_children                        36275 non-null  int64  
 2   type_of_meal_plan                     36275 non-null  object 
 3   required_car_parking_space            36275 non-null  int64  
 4   room_type_reserved                    36275 non-null  object 
 5   lead_time                             36275 non-null  int64  
 6   arrival_year                          36275 non-null  int64  
 7   arrival_month                         36275 non-null  int64  
 8   arrival_date                          36275 non-null  int64  
 9   market_segment_type                   36275 non-null  object 
 10  repeated_guest                        36275 non-null  int64  
 11  no_of_previous_cancellations          36275 non-null  int64  
 12  no_of_previous_bookings_not_canceled  36275 non-null  int64  
 13  avg_price_per_room                    36275 non-null  float64
 14  no_of_special_requests                36275 non-null  int64  
 15  booking_status                        36275 non-null  int64  
 16  total_nights                          36275 non-null  int64  
dtypes: float64(1), int64(13), object(3)
memory usage: 4.7+ MB

Convert Categorical to Numerical Values

In [106]:
for colname in df2.dtypes[df2.dtypes == 'category'].index:
    print(df2[colname].value_counts(dropna=False))
    print(" ")

Spliting the Data

In [107]:
X = df2.drop('booking_status',axis=1)     # Predictor feature columns (8 X m)
Y = df2['booking_status']   # Predicted class (1=True, 0=False) (1 X m)
In [108]:
Y.info()
<class 'pandas.core.series.Series'>
RangeIndex: 36275 entries, 0 to 36274
Series name: booking_status
Non-Null Count  Dtype
--------------  -----
36275 non-null  int64
dtypes: int64(1)
memory usage: 283.5 KB
In [109]:
# Identify object-type columns
object_cols = X.select_dtypes(include=['object','category']).columns

# Convert object-type columns to dummy variables
X = pd.get_dummies(X, columns=object_cols, dtype=int, drop_first=True)  # Drop the first category to avoid multicollinearity
# Ensure te output is integer (numeric 0 and 1) instead of Boolean
In [110]:
X.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 36275 entries, 0 to 36274
Data columns (total 26 columns):
 #   Column                                Non-Null Count  Dtype  
---  ------                                --------------  -----  
 0   no_of_adults                          36275 non-null  int64  
 1   no_of_children                        36275 non-null  int64  
 2   required_car_parking_space            36275 non-null  int64  
 3   lead_time                             36275 non-null  int64  
 4   arrival_year                          36275 non-null  int64  
 5   arrival_month                         36275 non-null  int64  
 6   arrival_date                          36275 non-null  int64  
 7   repeated_guest                        36275 non-null  int64  
 8   no_of_previous_cancellations          36275 non-null  int64  
 9   no_of_previous_bookings_not_canceled  36275 non-null  int64  
 10  avg_price_per_room                    36275 non-null  float64
 11  no_of_special_requests                36275 non-null  int64  
 12  total_nights                          36275 non-null  int64  
 13  type_of_meal_plan_Meal Plan 2         36275 non-null  int64  
 14  type_of_meal_plan_Meal Plan 3         36275 non-null  int64  
 15  type_of_meal_plan_Not Selected        36275 non-null  int64  
 16  room_type_reserved_Room_Type 2        36275 non-null  int64  
 17  room_type_reserved_Room_Type 3        36275 non-null  int64  
 18  room_type_reserved_Room_Type 4        36275 non-null  int64  
 19  room_type_reserved_Room_Type 5        36275 non-null  int64  
 20  room_type_reserved_Room_Type 6        36275 non-null  int64  
 21  room_type_reserved_Room_Type 7        36275 non-null  int64  
 22  market_segment_type_Complementary     36275 non-null  int64  
 23  market_segment_type_Corporate         36275 non-null  int64  
 24  market_segment_type_Offline           36275 non-null  int64  
 25  market_segment_type_Online            36275 non-null  int64  
dtypes: float64(1), int64(25)
memory usage: 7.2 MB
In [111]:
X.head()
Out[111]:
no_of_adults no_of_children required_car_parking_space lead_time arrival_year arrival_month arrival_date repeated_guest no_of_previous_cancellations no_of_previous_bookings_not_canceled avg_price_per_room no_of_special_requests total_nights type_of_meal_plan_Meal Plan 2 type_of_meal_plan_Meal Plan 3 type_of_meal_plan_Not Selected room_type_reserved_Room_Type 2 room_type_reserved_Room_Type 3 room_type_reserved_Room_Type 4 room_type_reserved_Room_Type 5 room_type_reserved_Room_Type 6 room_type_reserved_Room_Type 7 market_segment_type_Complementary market_segment_type_Corporate market_segment_type_Offline market_segment_type_Online
0 2 0 0 224 2017 10 2 0 0 0 65.00 0 3 0 0 0 0 0 0 0 0 0 0 0 1 0
1 2 0 0 5 2018 11 6 0 0 0 106.68 1 5 0 0 1 0 0 0 0 0 0 0 0 0 1
2 1 0 0 1 2018 2 28 0 0 0 60.00 0 3 0 0 0 0 0 0 0 0 0 0 0 0 1
3 2 0 0 211 2018 5 20 0 0 0 100.00 0 2 0 0 0 0 0 0 0 0 0 0 0 0 1
4 2 0 0 48 2018 4 11 0 0 0 94.50 0 2 0 0 1 0 0 0 0 0 0 0 0 0 1
In [112]:
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.30, random_state=1, stratify=Y
)
In [113]:
# Display the first few rows of X_train
print(X_train.head())
       no_of_adults  no_of_children  required_car_parking_space  lead_time  \
6870              2               0                           0          5   
531               2               1                           0         86   
3394              1               0                           0        105   
23540             1               0                           0         85   
15302             2               0                           0        309   

       arrival_year  arrival_month  arrival_date  repeated_guest  \
6870           2018             12            30               0   
531            2018             12             8               0   
3394           2018              5             5               0   
23540          2018             12             3               0   
15302          2018              5            13               0   

       no_of_previous_cancellations  no_of_previous_bookings_not_canceled  \
6870                              0                                     0   
531                               0                                     0   
3394                              0                                     0   
23540                             0                                     0   
15302                             0                                     0   

       avg_price_per_room  no_of_special_requests  total_nights  \
6870               116.00                       1             5   
531                122.00                       0             3   
3394               117.30                       0             3   
23540               98.00                       0             2   
15302              101.00                       0             3   

       type_of_meal_plan_Meal Plan 2  type_of_meal_plan_Meal Plan 3  \
6870                               0                              0   
531                                0                              0   
3394                               0                              0   
23540                              0                              0   
15302                              1                              0   

       type_of_meal_plan_Not Selected  room_type_reserved_Room_Type 2  \
6870                                0                               0   
531                                 0                               0   
3394                                0                               0   
23540                               0                               0   
15302                               0                               0   

       room_type_reserved_Room_Type 3  room_type_reserved_Room_Type 4  \
6870                                0                               0   
531                                 0                               0   
3394                                0                               0   
23540                               0                               0   
15302                               0                               0   

       room_type_reserved_Room_Type 5  room_type_reserved_Room_Type 6  \
6870                                0                               0   
531                                 0                               0   
3394                                0                               0   
23540                               0                               0   
15302                               0                               0   

       room_type_reserved_Room_Type 7  market_segment_type_Complementary  \
6870                                0                                  0   
531                                 0                                  0   
3394                                0                                  0   
23540                               0                                  0   
15302                               0                                  0   

       market_segment_type_Corporate  market_segment_type_Offline  \
6870                               0                            0   
531                                0                            0   
3394                               0                            0   
23540                              0                            0   
15302                              0                            1   

       market_segment_type_Online  
6870                            1  
531                             1  
3394                            1  
23540                           1  
15302                           0  
In [114]:
X.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 36275 entries, 0 to 36274
Data columns (total 26 columns):
 #   Column                                Non-Null Count  Dtype  
---  ------                                --------------  -----  
 0   no_of_adults                          36275 non-null  int64  
 1   no_of_children                        36275 non-null  int64  
 2   required_car_parking_space            36275 non-null  int64  
 3   lead_time                             36275 non-null  int64  
 4   arrival_year                          36275 non-null  int64  
 5   arrival_month                         36275 non-null  int64  
 6   arrival_date                          36275 non-null  int64  
 7   repeated_guest                        36275 non-null  int64  
 8   no_of_previous_cancellations          36275 non-null  int64  
 9   no_of_previous_bookings_not_canceled  36275 non-null  int64  
 10  avg_price_per_room                    36275 non-null  float64
 11  no_of_special_requests                36275 non-null  int64  
 12  total_nights                          36275 non-null  int64  
 13  type_of_meal_plan_Meal Plan 2         36275 non-null  int64  
 14  type_of_meal_plan_Meal Plan 3         36275 non-null  int64  
 15  type_of_meal_plan_Not Selected        36275 non-null  int64  
 16  room_type_reserved_Room_Type 2        36275 non-null  int64  
 17  room_type_reserved_Room_Type 3        36275 non-null  int64  
 18  room_type_reserved_Room_Type 4        36275 non-null  int64  
 19  room_type_reserved_Room_Type 5        36275 non-null  int64  
 20  room_type_reserved_Room_Type 6        36275 non-null  int64  
 21  room_type_reserved_Room_Type 7        36275 non-null  int64  
 22  market_segment_type_Complementary     36275 non-null  int64  
 23  market_segment_type_Corporate         36275 non-null  int64  
 24  market_segment_type_Offline           36275 non-null  int64  
 25  market_segment_type_Online            36275 non-null  int64  
dtypes: float64(1), int64(25)
memory usage: 7.2 MB
In [115]:
y_train.value_counts()
Out[115]:
booking_status
0    17073
1     8319
Name: count, dtype: int64
In [116]:
# checking the shape of the the train and test data
print("Number of rows in train data =", X_train.shape[0])
print("Number of rows in test data =", X_test.shape[0])
Number of rows in train data = 25392
Number of rows in test data = 10883
In [117]:
# adding constant to the train data
X_train1 = sm.add_constant(X_train)
# adding constant to the test data
X_test1 = sm.add_constant(X_test)
In [118]:
print("{0:0.2f}% data is in training set".format((len(X_train1)/len(df.index)) * 100))
print("{0:0.2f}% data is in test set".format((len(X_test1)/len(df.index)) * 100))
70.00% data is in training set
30.00% data is in test set
In [119]:
print("Shape of Training set : ", X_train1.shape)
print()
print("Shape of test set : ", X_test1.shape)
print()
print("Percentage of classes in training set:")
print()
print(y_train.value_counts(normalize=True))
print()
print("Percentage of classes in test set:")
print()
print(y_test.value_counts(normalize=True))
Shape of Training set :  (25392, 27)

Shape of test set :  (10883, 27)

Percentage of classes in training set:

booking_status
0   0.67
1   0.33
Name: proportion, dtype: float64

Percentage of classes in test set:

booking_status
0   0.67
1   0.33
Name: proportion, dtype: float64

We had seen that around 67.2% of observations belongs to class 0 (Not Canceled) and 32.87% observations belongs to class 1 (Cancellation), and this is preserved in the train and test sets

Building a Logistic Regression model¶

In [120]:
# defining a function to compute different metrics to check performance of a classification model built using statsmodels
def model_performance_classification_statsmodels(
    model, predictors, target, threshold=0.5
):
    """
    Function to compute different metrics to check classification model performance

    model: classifier
    predictors: independent variables
    target: dependent variable
    threshold: threshold for classifying the observation as class 1
    """

    # checking which probabilities are greater than threshold
    pred_temp = model.predict(predictors) > threshold
    # rounding off the above values to get classes
    pred = np.round(pred_temp)

    acc = accuracy_score(target, pred)  # to compute Accuracy
    recall = recall_score(target, pred)  # to compute Recall
    precision = precision_score(target, pred)  # to compute Precision
    f1 = f1_score(target, pred)  # to compute F1-score

    # creating a dataframe of metrics
    df_perf = pd.DataFrame(
        {"Accuracy": acc, "Recall": recall, "Precision": precision, "F1": f1,},
        index=[0],
    )

    return df_perf
In [121]:
# defining a function to plot the confusion_matrix of a classification model

def confusion_matrix_statsmodels(model, predictors, target, threshold=0.5):
    """
    To plot the confusion_matrix with percentages

    model: classifier
    predictors: independent variables
    target: dependent variable
    threshold: threshold for classifying the observation as class 1
    """
    y_pred = model.predict(predictors) > threshold
    cm = confusion_matrix(target, y_pred)
    labels = np.asarray(
        [
            ["{0:0.0f}".format(item) + "\n{0:.2%}".format(item / cm.flatten().sum())]
            for item in cm.flatten()
        ]
    ).reshape(2, 2)

    plt.figure(figsize=(6,4))
    sns.heatmap(cm, annot=labels, fmt="", cmap='nipy_spectral')
    plt.ylabel("True label")
    plt.xlabel("Predicted label")
In [122]:
def confusion_matrix_sklearn(model, predictors, target):
    """
    To plot the confusion_matrix with percentages

    model: classifier
    predictors: independent variables
    target: dependent variable
    """
    y_pred = model.predict(predictors)
    cm = confusion_matrix(target, y_pred)
    labels = np.asarray(
        [
            ["{0:0.0f}".format(item) + "\n{0:.2%}".format(item / cm.flatten().sum())]
            for item in cm.flatten()
        ]
    ).reshape(2, 2)

    plt.figure(figsize=(6, 4))
    sns.heatmap(cm, annot=labels, fmt="", cmap='viridis')
    plt.ylabel("True label")
    plt.xlabel("Predicted label")
In [123]:
## Function to create confusion matrix
def make_confusion_matrix(model,y_actual,labels=[1, 0]):
    '''
    model : classifier to predict values of X
    y_actual : ground truth

    '''
    y_predict = model.predict(X_test)
    cm=metrics.confusion_matrix( y_actual, y_predict, labels=[0, 1])
    df_cm = pd.DataFrame(cm, index = [i for i in ["Actual - No","Actual - Yes"]],
                  columns = [i for i in ['Predicted - No','Predicted - Yes']])
    group_counts = ["{0:0.0f}".format(value) for value in
                cm.flatten()]
    group_percentages = ["{0:.2%}".format(value) for value in
                         cm.flatten()/np.sum(cm)]
    labels = [f"{v1}\n{v2}" for v1, v2 in
              zip(group_counts,group_percentages)]
    labels = np.asarray(labels).reshape(2,2)
    plt.figure(figsize = (6,4))
    sns.heatmap(df_cm, annot=labels,fmt='',cmap='viridis')
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
In [124]:
##  Function to calculate f1 score
def get_f1_score(model, predictors, target):
    """
    model: classifier
    predictors: independent variables
    target: dependent variable

    """
    prediction = model.predict(predictors)
    return f1_score(target, prediction)

Model performance evaluation¶

A Model can make a wrong predictions as:

  1. Predicting a person booking a room will not cancel, but they do.
  2. Predicting a person booking a room will cancel, but they do not.

Which case is more important?

  • Both are important:

    • If we anticipate a guest’s cancellation but they don’t cancel, we’ll reassign their room to another guest. Unfortunately, this means we won’t have a room available for them upon arrival, resulting in significant costs for the hotel (due to offering a complimentary upgraded room). Additionally, we risk losing repeat customers and receiving negative reviews.”

    • If we anticipate that a person won’t cancel their reservation, but they end up doing so, we not only miss out on the revenue from their booking but also incur costs for remarketing the room. Additionally, we’ll likely need to rebook the room at a discounted rate."

How to can you reduce these costs i.e maximize True Positives?
Maximimize your F1 score

  • The greater the f1_score the higher the chances of identifying both the classes correctly.
  • We need to reduce both the False Negatives and False Positives
  • fi_score is computed as $$f1\_score = \frac{2 * Precision * Recall}{Precision + Recall}$$

  • The model_performance_classification_statsmodels function will be used to check the model performance of models.

  • The confusion_matrix_statsmodels function will be used to plot confusion matrix.

Classification Model Evaluation Metrics Summary:¶

Summary:

  • True Positive (TP): Correctly identifying positive cases: A true positive would be when the test correctly identifies a booking as not being canceled and it is not canceled.
  • True Negative (TN): Correctly identifying negative cases: A true negative would be when the test correctly identifies booking as being canceled and it is canceled.
  • False Positive (FP): Incorrectly identifying negative cases as positive: a false positive would be when the test incorrectly identifies a booking being not canceled and it is canceled.
  • False Negative (FN): Incorrectly identifying positive cases as negative: A false negative would be when the test incorrectly identifies a booking as being not canceled and it is canceled.
  1. Accuracy: tells us how often the model makes correct predictions out of all predictions. It's like checking how many answers you got right on a test out of all the questions.

  2. Precision: tells us how many of the predicted positive cases were actually positive. It's like asking, "When the model says something is true, how often is it right?" (TP/TP + FP)

  3. Recall: tells us how many of the actual positive cases were predicted correctly by the model. It's like asking, "Out of all the true positive cases, how many did the model find?" (TP/TP+FN)

  4. F1 Score: is a balance between precision and recall. It's useful when you care about both false positives and false negatives. It's like trying to find a sweet spot between "When the model says something is true, how often is it right?" and "Out of all the true positive cases, how many did the model find?"

Building the Logistic Regression model (with Sklearn library)

In [125]:
lg = LogisticRegression(solver="liblinear", random_state=1)
model = lg.fit(X_train1, y_train)

Model performance on training set

In [126]:
# predicting on training set
y_pred_train = lg.predict(X_train1)
In [127]:
print("Test set performance:")
print("Accuracy:", accuracy_score(y_train, y_pred_train))
print("Precision:", precision_score(y_train, y_pred_train))
print("Recall:", recall_score(y_train, y_pred_train))
print("F1:", f1_score(y_train, y_pred_train))
Test set performance:
Accuracy: 0.8064744801512287
Precision: 0.7431795457791744
Recall: 0.6254357494891213
F1: 0.679242819843342

Performance on test set

In [128]:
# predicting on the test set
y_pred_test = lg.predict(X_test1)
In [129]:
print("Test set performance:")
print("Accuracy:", accuracy_score(y_test, y_pred_test))
print("Precision:", precision_score(y_test, y_pred_test))
print("Recall:", recall_score(y_test, y_pred_test))
print("F1:", f1_score(y_test, y_pred_test))
Test set performance:
Accuracy: 0.8014334282826426
Precision: 0.733932733932734
Recall: 0.6180594503645541
F1: 0.671030598264576

The training and testing precision rates are very close. Training is 74.3% and testing is 73.4%.
The f1_score on the train and test sets are comparable, 67.9% compared to 67.1%, which indicates that the model is showing generalized results.

Building the Logistic Regression model (with statsmodels library)

In [130]:
X_train1 = X_train1.astype(float)  # Convert all columns to float
X_train1.dtypes
Out[130]:
const                                   float64
no_of_adults                            float64
no_of_children                          float64
required_car_parking_space              float64
lead_time                               float64
arrival_year                            float64
arrival_month                           float64
arrival_date                            float64
repeated_guest                          float64
no_of_previous_cancellations            float64
no_of_previous_bookings_not_canceled    float64
avg_price_per_room                      float64
no_of_special_requests                  float64
total_nights                            float64
type_of_meal_plan_Meal Plan 2           float64
type_of_meal_plan_Meal Plan 3           float64
type_of_meal_plan_Not Selected          float64
room_type_reserved_Room_Type 2          float64
room_type_reserved_Room_Type 3          float64
room_type_reserved_Room_Type 4          float64
room_type_reserved_Room_Type 5          float64
room_type_reserved_Room_Type 6          float64
room_type_reserved_Room_Type 7          float64
market_segment_type_Complementary       float64
market_segment_type_Corporate           float64
market_segment_type_Offline             float64
market_segment_type_Online              float64
dtype: object
In [131]:
y_train1 = y_train.astype(float)  # Convert all columns to float
y_train1.dtypes
Out[131]:
dtype('float64')
In [132]:
import warnings
from statsmodels.tools.sm_exceptions import ConvergenceWarning
warnings.simplefilter('ignore', ConvergenceWarning)

# fitting logistic regression model
logit = sm.Logit(y_train, X_train1.astype(float))
lg = logit.fit(disp=False)

print(lg.summary())
                           Logit Regression Results                           
==============================================================================
Dep. Variable:         booking_status   No. Observations:                25392
Model:                          Logit   Df Residuals:                    25365
Method:                           MLE   Df Model:                           26
Date:                Fri, 31 May 2024   Pseudo R-squ.:                  0.3316
Time:                        23:59:04   Log-Likelihood:                -10734.
converged:                      False   LL-Null:                       -16060.
Covariance Type:            nonrobust   LLR p-value:                     0.000
========================================================================================================
                                           coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------------------------
const                                 -893.4670    121.193     -7.372      0.000   -1131.002    -655.932
no_of_adults                             0.0383      0.038      1.017      0.309      -0.036       0.112
no_of_children                           0.0851      0.061      1.404      0.160      -0.034       0.204
required_car_parking_space              -1.6099      0.137    -11.751      0.000      -1.878      -1.341
lead_time                                0.0157      0.000     58.887      0.000       0.015       0.016
arrival_year                             0.4414      0.060      7.350      0.000       0.324       0.559
arrival_month                           -0.0477      0.006     -7.349      0.000      -0.060      -0.035
arrival_date                             0.0032      0.002      1.655      0.098      -0.001       0.007
repeated_guest                          -1.9232      0.766     -2.509      0.012      -3.425      -0.421
no_of_previous_cancellations             0.3475      0.101      3.430      0.001       0.149       0.546
no_of_previous_bookings_not_canceled    -1.3496      0.883     -1.529      0.126      -3.080       0.380
avg_price_per_room                       0.0183      0.001     24.736      0.000       0.017       0.020
no_of_special_requests                  -1.4886      0.030    -48.930      0.000      -1.548      -1.429
total_nights                             0.0695      0.010      7.299      0.000       0.051       0.088
type_of_meal_plan_Meal Plan 2            0.1823      0.067      2.728      0.006       0.051       0.313
type_of_meal_plan_Meal Plan 3           12.9000    425.208      0.030      0.976    -820.493     846.293
type_of_meal_plan_Not Selected           0.1967      0.053      3.691      0.000       0.092       0.301
room_type_reserved_Room_Type 2          -0.4199      0.133     -3.150      0.002      -0.681      -0.159
room_type_reserved_Room_Type 3           1.2239      1.884      0.650      0.516      -2.469       4.917
room_type_reserved_Room_Type 4          -0.2730      0.053     -5.120      0.000      -0.378      -0.168
room_type_reserved_Room_Type 5          -0.6731      0.215     -3.135      0.002      -1.094      -0.252
room_type_reserved_Room_Type 6          -0.8439      0.153     -5.532      0.000      -1.143      -0.545
room_type_reserved_Room_Type 7          -1.3645      0.297     -4.594      0.000      -1.947      -0.782
market_segment_type_Complementary      -18.9188    554.615     -0.034      0.973   -1105.944    1068.106
market_segment_type_Corporate           -0.8734      0.276     -3.170      0.002      -1.413      -0.333
market_segment_type_Offline             -1.7715      0.263     -6.723      0.000      -2.288      -1.255
market_segment_type_Online               0.0072      0.261      0.027      0.978      -0.504       0.518
========================================================================================================

There are 25392 observations.
There are 5 P>|z| greater than 0.05. These could be considered significant.
Market Segment complementarty has a -18.9188 coeefficient and type_of_meal_plan_3 os 12.90.

Model Performance Testing¶

In [133]:
print("Training performance:")
model_performance_classification_statsmodels(lg, X_train1, y_train)
Training performance:
Out[133]:
Accuracy Recall Precision F1
0 0.81 0.63 0.74 0.68

Checking Multicollinearity¶

  • In order to make statistical inferences from a logistic regression model, it is important to ensure that there is no multicollinearity present in the data.

  • VIF standards:

     *   If VIF is between 1 and 5, then there is low multicollinearity.
     *   If VIF is between 5 and 10, we say there is moderate multicollinearity.
     *   If VIF is exceeding 10, it shows signs of high multicollinearity.
In [134]:
# defining a function to check VIF
def checking_vif(predictors):
    vif = pd.DataFrame()
    vif["feature"] = predictors.columns

    # calculating VIF for each feature
    vif["VIF"] = [
        variance_inflation_factor(predictors.values, i)
        for i in range(len(predictors.columns))
    ]
    return vif
In [135]:
checking_vif(X_train1).sort_values(by='VIF', ascending=False)
Out[135]:
feature VIF
0 const 39547263.42
26 market_segment_type_Online 69.47
25 market_segment_type_Offline 62.51
24 market_segment_type_Corporate 16.63
23 market_segment_type_Complementary 4.35
11 avg_price_per_room 2.03
2 no_of_children 2.01
21 room_type_reserved_Room_Type 6 1.99
8 repeated_guest 1.75
10 no_of_previous_bookings_not_canceled 1.57
5 arrival_year 1.43
4 lead_time 1.40
19 room_type_reserved_Room_Type 4 1.36
1 no_of_adults 1.34
9 no_of_previous_cancellations 1.32
16 type_of_meal_plan_Not Selected 1.28
6 arrival_month 1.28
14 type_of_meal_plan_Meal Plan 2 1.26
12 no_of_special_requests 1.25
17 room_type_reserved_Room_Type 2 1.09
13 total_nights 1.09
22 room_type_reserved_Room_Type 7 1.09
3 required_car_parking_space 1.03
20 room_type_reserved_Room_Type 5 1.03
15 type_of_meal_plan_Meal Plan 3 1.01
7 arrival_date 1.01
18 room_type_reserved_Room_Type 3 1.00

Observations:

Some of the market segment dummy variables are showing higher than 5 the rest of the variables are all below 5.
Appears there is no multicollinearity so our assumption is satisfied.
Need to check p-values of predictor variables to check for significance. Need to check if dropping any variables cause the p-value to change

In [136]:
# running a loop to drop variables with high p-value

# initial list of columns
cols = X_train1.columns.tolist()

# setting an initial max p-value
max_p_value = 1

while len(cols) > 0:
    # defining the train set
    X_train_aux = X_train1[cols]

    # fitting the model
    model = sm.Logit(y_train, X_train_aux).fit(disp=False)

    # getting the p-values and the maximum p-value
    p_values = model.pvalues
    max_p_value = max(p_values)

    # name of the variable with maximum p-value
    feature_with_p_max = p_values.idxmax()

    if max_p_value > 0.05:
        cols.remove(feature_with_p_max)
    else:
        break

selected_features = cols
print(selected_features)
['const', 'required_car_parking_space', 'lead_time', 'arrival_year', 'arrival_month', 'repeated_guest', 'no_of_previous_cancellations', 'avg_price_per_room', 'no_of_special_requests', 'total_nights', 'type_of_meal_plan_Meal Plan 2', 'type_of_meal_plan_Not Selected', 'room_type_reserved_Room_Type 2', 'room_type_reserved_Room_Type 4', 'room_type_reserved_Room_Type 5', 'room_type_reserved_Room_Type 6', 'room_type_reserved_Room_Type 7', 'market_segment_type_Corporate', 'market_segment_type_Offline']
In [137]:
X_train2 = X_train1[selected_features]
In [138]:
logit2 = sm.Logit(y_train, X_train2.astype(float))
lg2 = logit2.fit(disp=False)
print(lg2.summary())
                           Logit Regression Results                           
==============================================================================
Dep. Variable:         booking_status   No. Observations:                25392
Model:                          Logit   Df Residuals:                    25373
Method:                           MLE   Df Model:                           18
Date:                Fri, 31 May 2024   Pseudo R-squ.:                  0.3306
Time:                        23:59:07   Log-Likelihood:                -10751.
converged:                       True   LL-Null:                       -16060.
Covariance Type:            nonrobust   LLR p-value:                     0.000
==================================================================================================
                                     coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------------------
const                           -875.3634    120.780     -7.248      0.000   -1112.088    -638.639
required_car_parking_space        -1.6098      0.137    -11.762      0.000      -1.878      -1.342
lead_time                          0.0158      0.000     59.844      0.000       0.015       0.016
arrival_year                       0.4325      0.060      7.225      0.000       0.315       0.550
arrival_month                     -0.0494      0.006     -7.644      0.000      -0.062      -0.037
repeated_guest                    -3.0704      0.600     -5.117      0.000      -4.246      -1.894
no_of_previous_cancellations       0.2899      0.077      3.743      0.000       0.138       0.442
avg_price_per_room                 0.0189      0.001     26.505      0.000       0.017       0.020
no_of_special_requests            -1.4831      0.030    -49.225      0.000      -1.542      -1.424
total_nights                       0.0713      0.009      7.514      0.000       0.053       0.090
type_of_meal_plan_Meal Plan 2      0.1749      0.067      2.619      0.009       0.044       0.306
type_of_meal_plan_Not Selected     0.2077      0.053      3.936      0.000       0.104       0.311
room_type_reserved_Room_Type 2    -0.3760      0.129     -2.912      0.004      -0.629      -0.123
room_type_reserved_Room_Type 4    -0.2716      0.052     -5.265      0.000      -0.373      -0.170
room_type_reserved_Room_Type 5    -0.6792      0.214     -3.175      0.002      -1.098      -0.260
room_type_reserved_Room_Type 6    -0.7378      0.120     -6.161      0.000      -0.972      -0.503
room_type_reserved_Room_Type 7    -1.3168      0.291     -4.522      0.000      -1.887      -0.746
market_segment_type_Corporate     -0.8955      0.103     -8.684      0.000      -1.098      -0.693
market_segment_type_Offline       -1.7803      0.052    -34.463      0.000      -1.882      -1.679
==================================================================================================

No p-value is greater than 0.05.

Coefficients

Positive - lead_time, arrival_year, no_of_previous_cancellations, avg_price_per_room, total_nights, type_of_meal_plan_Not Selected,type_of_meal_plan_Meal Plan 2

Negative - required_car_parking_space, arrival_month, repeated_guest, no_of_special_requests, room_type_reserved_Room Type 2, room_type_reserved_Room Type 4, room_type_reserved_Room Type 5, room_type_reserved_Room Type 6, room_type_reserved_Room Type 7, market_segment_type_Corporate,market_segment_type_Offline

Positive - means an increase in the variable will lead to an increase in the chance of a booking being canceled.

Negative - means a decrease in the variable will lead to an decrease in the chance of a booking being canceled.

Coefficients needs to be converted to odds

In logistic regression, the coefficients represent the logarithm of the odds. To obtain the actual odds, we need to take the exponential of these coefficients.

odds = exp(b)

Percent change in odds is odds=(exp(b)-1)*100

Final Model Summary¶

Since all variables in lg2 have a p-value less than 0.05 we can consider that our final model.

In [139]:
# The purpose of converting the coefficients into a probability (unlike linear regression), it's because the logistic regression model estimates log(odds) as a linear function of the predictor variables.
#Since the coefficients in the logistic regression model represent the change in log(odds), we need to exponentiate them to interpret them in terms of odds.

# converting coefficients to odds
odds = np.exp(lg2.params)

# finding the percentage change
perc_change_odds = (np.exp(lg2.params) - 1) * 100

# adding the odds to a dataframe
pd.DataFrame({"Odds": odds, "Change_odds": perc_change_odds}, index=X_train2.columns).sort_values(by='Change_odds')
Out[139]:
Odds Change_odds
const 0.00 -100.00
repeated_guest 0.05 -95.36
market_segment_type_Offline 0.17 -83.14
required_car_parking_space 0.20 -80.01
no_of_special_requests 0.23 -77.31
room_type_reserved_Room_Type 7 0.27 -73.20
market_segment_type_Corporate 0.41 -59.16
room_type_reserved_Room_Type 6 0.48 -52.18
room_type_reserved_Room_Type 5 0.51 -49.30
room_type_reserved_Room_Type 2 0.69 -31.34
room_type_reserved_Room_Type 4 0.76 -23.78
arrival_month 0.95 -4.82
lead_time 1.02 1.59
avg_price_per_room 1.02 1.91
total_nights 1.07 7.39
type_of_meal_plan_Meal Plan 2 1.19 19.12
type_of_meal_plan_Not Selected 1.23 23.08
no_of_previous_cancellations 1.34 33.62
arrival_year 1.54 54.11

Top 5 Coefficients that will cause a negative change:

  1. When all other factors remain constant, being a repeated guest reduces the odds of a booking being canceled by 95.36%.
  2. When all other factors remain constant, being a guest that books a room offline reduces the odds of a booking being canceled by 83.14%.
  3. When all other factors remain constant, being a guest that requires a car parking space reduces the odds of a booking being canceled by 80.01%.
  4. When all other factors remain constant, being a guest that has a special request reduces the odds of a booking being canceled by 77.31%.
  5. When all other factors remain constant, being a guest that reserves room type 7 reduces the odds of a booking being canceled by 73.20%.

Top 5 Coefficients that will cause a positive change:

  1. When all other factors remain constant, the arrival year increases the odds of a booking being canceled by 54.11%.
  2. When all other factors remain constant, the number of previous cancellations increases the odds of a booking being canceled by 33.62%.
  3. When all other factors remain constant, not selecting a meal plan increases the odds of a booking being canceled by 23.08%.
  4. When all other factors remain constant, selecting meal plan 2 increases the odds of a booking being canceled by 19.12%.
  5. When all other factors remain constant, the total nights booked increases the odds of a booking being canceled by 7.39%.

Logistic Regression model performance evaluation¶

Model Performance on final training set

In [140]:
# creating confusion matrix
confusion_matrix_statsmodels(lg2, X_train2, y_train)

True Positive - 15235
True Negative - 5265
False Positive - 1838
False Negative - 1838

In [141]:
log_reg_model_train_perf = model_performance_classification_statsmodels(lg2, X_train2, y_train)

print("Training performance:")
log_reg_model_train_perf
Training performance:
Out[141]:
Accuracy Recall Precision F1
0 0.81 0.63 0.74 0.68

This is showing a F1 score of 0.68.

In [142]:
X_test2 = X_test1[list(X_train2.columns)]
In [143]:
vif_series = pd.Series(
    [variance_inflation_factor(X_train2.values, i) for i in range(X_train2.shape[1])],
    index=X_train2.columns,
    dtype=float,
)
print("Series before feature selection: \n\n{}\n".format(vif_series))
Series before feature selection: 

const                            39098190.79
required_car_parking_space              1.03
lead_time                               1.36
arrival_year                            1.42
arrival_month                           1.26
repeated_guest                          1.49
no_of_previous_cancellations            1.18
avg_price_per_room                      1.62
no_of_special_requests                  1.22
total_nights                            1.08
type_of_meal_plan_Meal Plan 2           1.25
type_of_meal_plan_Not Selected          1.24
room_type_reserved_Room_Type 2          1.03
room_type_reserved_Room_Type 4          1.27
room_type_reserved_Room_Type 5          1.02
room_type_reserved_Room_Type 6          1.25
room_type_reserved_Room_Type 7          1.03
market_segment_type_Corporate           1.41
market_segment_type_Offline             1.56
dtype: float64

In [144]:
# creating confusion matrix
confusion_matrix_statsmodels(lg2, X_test2, y_test)
In [145]:
log_reg_model_test_perf = model_performance_classification_statsmodels(
    lg2, X_test2, y_test
)

print("Test performance:")
log_reg_model_test_perf
Test performance:
Out[145]:
Accuracy Recall Precision F1
0 0.80 0.63 0.74 0.68
  • The training is showing and F1 score of 0.68 and the testing is showing an F1 score of 0.68
  • As the train and test performances are comparable, the model is not overfitting.
  • Moving forward we will try to improve the performance of the model since 0.68 is not very high.

Model Performance Improvement¶

  • Will try to improve the F1 score by changing the model threshold
  • Check the ROC curve, compute the area under the ROC curve (ROC-AUC). This will help find the optimal threshold needed.
  • Will analyze the Precision-Recall curve to strike the right balance between precision and recall, given that the F1 score serves as a
    harmonic balance between these two metrics

ROC Curve and ROC-AUC¶

ROC Curve:

  • What It Is: The ROC curve is a graphical representation that shows the performance of a binary classification model across different threshold settings.

  • Simple Explanation: Imagine you have a model that predicts whether an email is spam or not. The ROC curve tells you how well your model can distinguish between spam and non-spam emails as you adjust the threshold for classifying an email as spam.

  • X-axis: False Positive Rate (FPR) - It represents the ratio of false positive predictions (predicting spam when it's not) to all actual negative instances.

  • Y-axis: True Positive Rate (TPR) - It represents the ratio of true positive predictions (correctly predicting spam) to all actual positive instances.

  • Plotting Points: The ROC curve is generated by plotting TPR against FPR for various threshold settings.

  • Interpretation: ROC-AUC ranges from 0 to 1, where 1 represents a perfect model (all true positives, no false positives), and 0.5 represents a random model (no discrimination between classes).

  • Comparing Models: You can use ROC-AUC to compare different models. The model with a higher ROC-AUC value is generally considered to be better at distinguishing between the classes.

ROC-AUC (Training(

In [146]:
logit_roc_auc_train = roc_auc_score(y_train, lg2.predict(X_train2))
fpr, tpr, thresholds = roc_curve(y_train, lg2.predict(X_train2))
plt.figure(figsize=(7, 5))
plt.plot(fpr, tpr, label="Logistic Regression (area = %0.2f)" % logit_roc_auc_train)
plt.plot([0, 1], [0, 1], "r--")
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("Receiver operating characteristic")
plt.legend(loc="lower right")
plt.show()
Out[146]:
<Figure size 700x500 with 0 Axes>
Out[146]:
[<matplotlib.lines.Line2D at 0x7cf381740760>]
Out[146]:
[<matplotlib.lines.Line2D at 0x7cf381740670>]
Out[146]:
(0.0, 1.0)
Out[146]:
(0.0, 1.05)
Out[146]:
Text(0.5, 0, 'False Positive Rate')
Out[146]:
Text(0, 0.5, 'True Positive Rate')
Out[146]:
Text(0.5, 1.0, 'Receiver operating characteristic')
Out[146]:
<matplotlib.legend.Legend at 0x7cf38131da80>

Based on the logistic regression area of 0.86 it appears the model is performing well.

Optimal threshold using AUC-ROC curve¶

In [147]:
# Optimal threshold as per AUC-ROC curve
# The optimal cut off would be where tpr is high and fpr is low
fpr, tpr, thresholds = roc_curve(y_train, lg2.predict(X_train2))

optimal_idx = np.argmax(tpr - fpr)
optimal_threshold_auc_roc = thresholds[optimal_idx]
print(optimal_threshold_auc_roc)
0.34049961761164615
In [148]:
# creating confusion matrix
confusion_matrix_statsmodels(
    lg2, X_train2, y_train, threshold=optimal_threshold_auc_roc
)
In [149]:
# checking model performance for this model
log_reg_model_train_perf_threshold_auc_roc = model_performance_classification_statsmodels(
    lg2, X_train2, y_train, threshold=optimal_threshold_auc_roc
)
print("Training performance:")
log_reg_model_train_perf_threshold_auc_roc
Training performance:
Out[149]:
Accuracy Recall Precision F1
0 0.79 0.76 0.65 0.70

Accuracy - decreased by 0.01
Recall - increased by 0.13
Precision - decreased by 0.09
F1 score - increased by .02

Since the recall and F1 score increased this model with this threshold is more useful for INN Hotel's intended use case.

Model performance on test set

In [150]:
logit_roc_auc_test = roc_auc_score(y_test, lg2.predict(X_test2))
fpr, tpr, thresholds = roc_curve(y_test, lg2.predict(X_test2))
plt.figure(figsize=(7, 5))
plt.plot(fpr, tpr, label="Logistic Regression (area = %0.2f)" % logit_roc_auc_test)
plt.plot([0, 1], [0, 1], "r--")
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("Receiver operating characteristic")
plt.legend(loc="lower right")
plt.show()
Out[150]:
<Figure size 700x500 with 0 Axes>
Out[150]:
[<matplotlib.lines.Line2D at 0x7cf37a441570>]
Out[150]:
[<matplotlib.lines.Line2D at 0x7cf37a443c70>]
Out[150]:
(0.0, 1.0)
Out[150]:
(0.0, 1.05)
Out[150]:
Text(0.5, 0, 'False Positive Rate')
Out[150]:
Text(0, 0.5, 'True Positive Rate')
Out[150]:
Text(0.5, 1.0, 'Receiver operating characteristic')
Out[150]:
<matplotlib.legend.Legend at 0x7cf37bfa90f0>
In [151]:
# creating confusion matrix
confusion_matrix_statsmodels(lg2, X_test2, y_test, threshold=optimal_threshold_auc_roc)
In [152]:
# checking model performance for this model
log_reg_model_test_perf_threshold_auc_roc = model_performance_classification_statsmodels(
    lg2, X_test2, y_test, threshold=optimal_threshold_auc_roc
)
print("Test performance:")
log_reg_model_test_perf_threshold_auc_roc
Test performance:
Out[152]:
Accuracy Recall Precision F1
0 0.78 0.76 0.64 0.69

The test performs almost the same as the training.
Accuracy - same
Recall - same
Precision - decrease 0.01
F1 score - decrease of 0.01

Optimal threshold using the Precision-Recall curve

In [153]:
y_scores = lg2.predict(X_train2)
prec, rec, tre = precision_recall_curve(y_train, y_scores,)


def plot_prec_recall_vs_tresh(precisions, recalls, thresholds):
    plt.plot(thresholds, precisions[:-1], "b--", label="precision")
    plt.plot(thresholds, recalls[:-1], "g--", label="recall")
    plt.xlabel("Threshold")
    plt.legend(loc="upper left")
    plt.ylim([0, 1])

plt.figure(figsize=(10, 7))
plot_prec_recall_vs_tresh(prec, rec, tre)
plt.show()
Out[153]:
<Figure size 1000x700 with 0 Axes>

At the threshold of ~ 0.42, we get balanced recall and precision.

In [154]:
print(tre)
[4.36283975e-07 1.76602177e-06 3.18008213e-06 ... 9.95935711e-01
 9.97452510e-01 9.98159837e-01]
In [155]:
# setting the threshold
optimal_threshold_curve = 0.42

Model Performance on Training Set

In [156]:
# creating confusion matrix
confusion_matrix_statsmodels(lg2, X_train2, y_train, threshold=optimal_threshold_curve)
In [157]:
log_reg_model_train_perf_threshold_curve = model_performance_classification_statsmodels(
    lg2, X_train2, y_train, threshold=optimal_threshold_curve
)
print("Training performance:")
log_reg_model_train_perf_threshold_curve
Training performance:
Out[157]:
Accuracy Recall Precision F1
0 0.80 0.70 0.70 0.70

Model is performing well on the training set.

Threshold went from 0.34 to 0.42.

Accuracy - increased by 0.01
Recall - decreased by 0.06
Precision - increased by 0.05
F1 Score - stayed the same.

Although accuracy and precision increased, recall decreased and F1 score stayed the same.
Original threshold is better on recall.

In [158]:
# creating confusion matrix
confusion_matrix_statsmodels(lg2, X_test2, y_test, threshold=optimal_threshold_curve)
In [159]:
log_reg_model_test_perf_threshold_curve = model_performance_classification_statsmodels(
    lg2, X_test2, y_test, threshold=optimal_threshold_curve
)
print("Test performance:")
log_reg_model_test_perf_threshold_curve
Test performance:
Out[159]:
Accuracy Recall Precision F1
0 0.80 0.69 0.69 0.69

Model is performing well on the testing set.

Threshold went from 0.34 to 0.42.

Accuracy - increased by 0.02
Recall - decreased by 0.07
Precision - increased by 0.05
F1 Score - stayed the same.

Although accuracy and precision increased, recall decreased and F1 score stayed the same.
Original threshold is better on recall. Match almost perfectly to the training model.

Logistic Regression model summary

In [160]:
# training performance comparison

models_train_comp_df = pd.concat(
    [
        log_reg_model_train_perf.T,
        log_reg_model_train_perf_threshold_auc_roc.T,
        log_reg_model_train_perf_threshold_curve.T,
    ],
    axis=1,
)
models_train_comp_df.columns = [
    "Logistic Regression-sklearn",
    "Logistic Regression-0.34 Threshold",
    "Logistic Regression-0.42 Threshold",
]

print("Training performance comparison:")
models_train_comp_df
Training performance comparison:
Out[160]:
Logistic Regression-sklearn Logistic Regression-0.34 Threshold Logistic Regression-0.42 Threshold
Accuracy 0.81 0.79 0.80
Recall 0.63 0.76 0.70
Precision 0.74 0.65 0.70
F1 0.68 0.70 0.70
In [161]:
# testing performance comparison

models_test_comp_df = pd.concat(
    [
        log_reg_model_test_perf.T,
        log_reg_model_test_perf_threshold_auc_roc.T,
        log_reg_model_test_perf_threshold_curve.T,
    ],
    axis=1,
)
models_test_comp_df.columns = [
    "Logistic Regression-sklearn",
    "Logistic Regression-0.34 Threshold",
    "Logistic Regression-0.42 Threshold",
]

print("Test set performance comparison:")
models_test_comp_df
Test set performance comparison:
Out[161]:
Logistic Regression-sklearn Logistic Regression-0.34 Threshold Logistic Regression-0.42 Threshold
Accuracy 0.80 0.78 0.80
Recall 0.63 0.76 0.69
Precision 0.74 0.64 0.69
F1 0.68 0.69 0.69

Conclusions

  • Almost all the three models are performing well on both training and test data without the problem of overfitting.
  • Similiar results are being achieved on the training and testing set.
  • The model with a threshold (0.34) and (0.42) both are showing the same F1 score. However the (0.34) threshold has a higher Recall. Therefore it can be selected as the final model.
  • INN Hotels can utilize this model to forecast booking cancellations, achieving an F1 score of 0.70 on the training set and 0.69 on the test set.

Top 5 Coefficients that will cause a negative change:

  1. When all other factors remain constant, being a repeated guest reduces the odds of a booking being canceled by 95.36%.
  2. When all other factors remain constant, being a guest that books a room offline reduces the odds of a booking being canceled by 83.14%.
  3. When all other factors remain constant, being a guest that requires a car parking space reduces the odds of a booking being canceled by 80.01%.
  4. When all other factors remain constant, being a guest that has a special request reduces the odds of a booking being canceled by 77.31%.
  5. When all other factors remain constant, being a guest that reserves room type 7 reduces the odds of a booking being canceled by 73.20%.

Top 5 Coefficients that will cause a positive change:

  1. When all other factors remain constant, the arrival year increases the odds of a booking being canceled by 54.11%.
  2. When all other factors remain constant, the number of previous cancellations increases the odds of a booking being canceled by 33.62%.
  3. When all other factors remain constant, not selecting a meal plan increases the odds of a booking being canceled by 23.08%.
  4. When all other factors remain constant, selecting meal plan 2 increases the odds of a booking being canceled by 19.12%.
  5. When all other factors remain constant, the total nights booked increases the odds of a booking being canceled by 7.39%.

Building a Decision Tree model¶

Decision Tree (default)¶

Hyperparameter:¶

  • A value that is set before the learning process begins and is used to control the behavior of the learning algorithm.
  • This of it like putting specific settings in the washing machine before you clean your clothes (data).
  • Settings can be: water temp, spin speed, soil level, Detergent Dispenser, etc..

criterion: The function to measure the quality of a split ("gini" or "entropy").

Entropy, based on the concept from information theory, measures the amount of disorder or unpredictability in the data at a node. Entropy ranges from 0 (pure node) to 1 (maximally mixed node with equal distribution of classes).

Gini impurity is a measure of how often a randomly chosen element from the set would be incorrectly labeled if it was randomly labeled according to the distribution of labels in the subset. Range: The Gini impurity ranges from 0 (pure node) to 0.5 (evenly mixed classes in the node).

Preference: Gini impurity is typically preferred for its computational efficiency, but entropy is sometimes chosen for its stronger theoretical foundation from information theory.

splitter: The strategy used to choose the split at each node ("best" or "random").

max_depth: The maximum depth of the tree (None for no limit).

min_samples_split: The minimum number of samples required to split an internal node (int or float).

min_samples_leaf: The minimum number of samples required to be at a leaf node (int or float).

min_weight_fraction_leaf: The minimum weighted fraction of the sum total of weights required to be at a leaf node.

max_features: The number of features to consider when looking for the best split (int, float, string, or None).

random_state: Controls the randomness of the estimator (int, RandomState instance, or None).

max_leaf_nodes: Grow a tree with the maximum number of leaf nodes (None for unlimited).

min_impurity_decrease: A node will be split if this split induces a decrease of the impurity greater than or equal to this value.

class_weight: Weights associated with classes (dict, list of dicts, "balanced", or None).

ccp_alpha: Complexity parameter used for Minimal Cost-Complexity Pruning (non-negative float).

In [162]:
# Create a DecisionTreeClassifier with all parameters specified
clf_example = DecisionTreeClassifier(
    criterion='entropy',             # Measure quality of split using 'entropy' 'Gini' (Measure of randomness or disorder in a set of data)
    splitter='random',               # Use random best split
    max_depth=3,                     # Maximum depth of the tree is 5
    min_samples_split=4,             # Minimum 4 samples required to split an internal node
    min_samples_leaf=3,              # Minimum 2 samples required to be at a leaf node
    min_weight_fraction_leaf=0.01,   # Minimum weighted fraction of sum total of weights required at a leaf node
    max_features='sqrt',             # Number of features to consider when looking for the best split is the square root of total features
    random_state=44,                 # Control randomness of the estimator
    max_leaf_nodes=15,               # Maximum number of leaf nodes is 10
    min_impurity_decrease=0.01,      # A node will be split if this split induces a decrease in impurity greater than or equal to this value
    class_weight='balanced',         # Adjust weights inversely proportional to class frequencies in the input data
    ccp_alpha=0.01                   # Complexity parameter used for Minimal Cost-Complexity Pruning
)
In [163]:
df3["booking_status"].value_counts()
Out[163]:
booking_status
Not_Canceled    24390
Canceled        11885
Name: count, dtype: int64
In [164]:
df3["booking_status"] = df3["booking_status"].apply(lambda x: 1 if x == "Canceled" else 0)
In [165]:
#resplit data for the decision tree model
X = df3.drop(["booking_status",], axis=1)
Y = df3["booking_status"]

# Identify object-type columns
object_cols = X.select_dtypes(include=['object','category']).columns

# Convert object-type columns to dummy variables
X = pd.get_dummies(X, columns=object_cols, dtype=int, drop_first=True)  # Drop the first category to avoid multicollinearity
# Ensure te output is integer (numeric 0 and 1) instead of Boolean

# Splitting data in train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.30, random_state=1
)
In [166]:
Y.unique()
Out[166]:
array([0, 1])
In [167]:
X_train.head()
Out[167]:
no_of_adults no_of_children required_car_parking_space lead_time arrival_year arrival_month arrival_date repeated_guest no_of_previous_cancellations no_of_previous_bookings_not_canceled avg_price_per_room no_of_special_requests total_nights type_of_meal_plan_Meal Plan 2 type_of_meal_plan_Meal Plan 3 type_of_meal_plan_Not Selected room_type_reserved_Room_Type 2 room_type_reserved_Room_Type 3 room_type_reserved_Room_Type 4 room_type_reserved_Room_Type 5 room_type_reserved_Room_Type 6 room_type_reserved_Room_Type 7 market_segment_type_Complementary market_segment_type_Corporate market_segment_type_Offline market_segment_type_Online
13662 1 0 0 163 2018 10 15 0 0 0 115.00 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0
26641 2 0 0 113 2018 3 31 0 0 0 78.15 1 3 0 0 0 1 0 0 0 0 0 0 0 0 1
17835 2 0 0 359 2018 10 14 0 0 0 78.00 1 5 0 0 0 0 0 0 0 0 0 0 0 1 0
21485 2 0 0 136 2018 6 29 0 0 0 85.50 0 3 0 0 1 0 0 0 0 0 0 0 0 0 1
5670 2 0 0 21 2018 8 15 0 0 0 151.00 0 3 0 0 0 0 0 0 0 0 0 0 0 0 1

We will construct our model using the DecisionTreeClassifier function. By default, it employs the ‘gini’ criterion to determine how to split the data at each node. Alternatively, you can choose the ‘entropy’ criterion for splitting

In [168]:
X_train = X_train.astype(float)  # Convert all columns to float
X_train.dtypes
Out[168]:
no_of_adults                            float64
no_of_children                          float64
required_car_parking_space              float64
lead_time                               float64
arrival_year                            float64
arrival_month                           float64
arrival_date                            float64
repeated_guest                          float64
no_of_previous_cancellations            float64
no_of_previous_bookings_not_canceled    float64
avg_price_per_room                      float64
no_of_special_requests                  float64
total_nights                            float64
type_of_meal_plan_Meal Plan 2           float64
type_of_meal_plan_Meal Plan 3           float64
type_of_meal_plan_Not Selected          float64
room_type_reserved_Room_Type 2          float64
room_type_reserved_Room_Type 3          float64
room_type_reserved_Room_Type 4          float64
room_type_reserved_Room_Type 5          float64
room_type_reserved_Room_Type 6          float64
room_type_reserved_Room_Type 7          float64
market_segment_type_Complementary       float64
market_segment_type_Corporate           float64
market_segment_type_Offline             float64
market_segment_type_Online              float64
dtype: object
In [169]:
# checking the shape of the the train and test data
print("Number of rows in train data =", X_train.shape[0])
print("Number of rows in test data =", X_test.shape[0])
Number of rows in train data = 25392
Number of rows in test data = 10883

adding constant not needed for decisiontrees
sm.add_constant

In [170]:
print("{0:0.2f}% data is in training set".format((len(X_train)/len(df.index)) * 100))
print("{0:0.2f}% data is in test set".format((len(X_test)/len(df.index)) * 100))
70.00% data is in training set
30.00% data is in test set
In [171]:
#confirm percentage of each class in both training and test datasets
print("Percentage of classes in training set:")
print(y_train.value_counts(normalize=True))
print(' ')
print("Percentage of classes in test set:")
print(y_test.value_counts(normalize=True))
Percentage of classes in training set:
booking_status
0   0.67
1   0.33
Name: proportion, dtype: float64
 
Percentage of classes in test set:
booking_status
0   0.68
1   0.32
Name: proportion, dtype: float64
In [172]:
model = DecisionTreeClassifier(random_state=1)
model.fit(X_train, y_train)
Out[172]:
DecisionTreeClassifier(random_state=1)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
DecisionTreeClassifier(random_state=1)

Model Evaluation¶

Model evaluation criterion

Model can make wrong predictions as:

  • Predicting a booking will not be canceled but in reality, the booking is canceled (FN)
  • Predicting a machine will be canceled but in reality, the booking is not canceled (FP)

Which case is more important?

  • If we predict that a booking will not be canceled but in reality, the booking is canceled, then the company will have to bear the cost of a room being empty
  • If we predict that a booking will be canceled but in reality, the booking is not canceled, then the company will have overbooked the room. This could lead to a loss revenue for current booking, bad reputation for canceling a guests booking, future loss revenue as guest will book somewhere else.
  • Typically the loss in revenue would be less if the guest cancels and the company is expecting them not to cancel.

How to reduce the losses?

The company would want the recall to be maximized, the greater the recall score the higher the chances of minimizing the False Negatives.

In [173]:
# defining a function to compute different metrics to check performance of a classification model built using sklearn
def model_performance_classification_sklearn(model, predictors, target):
    """
    Function to compute different metrics to check classification model performance

    model: classifier
    predictors: independent variables
    target: dependent variable
    """

    # predicting using the independent variables
    pred = model.predict(predictors)

    acc = accuracy_score(target, pred)  # to compute Accuracy
    recall = recall_score(target, pred)  # to compute Recall
    precision = precision_score(target, pred)  # to compute Precision
    f1 = f1_score(target, pred)  # to compute F1-score

    # creating a dataframe of metrics
    df_perf = pd.DataFrame(
        {"Accuracy": acc, "Recall": recall, "Precision": precision, "F1": f1,},
        index=[0],
    )

    return df_perf
In [174]:
confusion_matrix_sklearn(model, X_train, y_train)
In [175]:
decision_tree_perf_train_without = model_performance_classification_sklearn(
    model, X_train, y_train
)
decision_tree_perf_train_without
Out[175]:
Accuracy Recall Precision F1
0 0.99 0.99 1.00 0.99

Model is showing an F1 score of 99. It is only misclassifying 147 bookings. However there is probably significant overfitting in the training data.

In [176]:
confusion_matrix_sklearn(model, X_test, y_test)
In [177]:
decision_tree_perf_test_without = model_performance_classification_sklearn(
    model, X_test, y_test
)
decision_tree_perf_test_without
Out[177]:
Accuracy Recall Precision F1
0 0.87 0.80 0.79 0.80

There is a huge difference between the training set and test set. That means there is overfitting.

Decision Tree (with class_weights)¶

  • If the frequency of class A is 10% and the frequency of class B is 90%, then class B will become the dominant class and the decision tree will become biased toward the dominant classes

  • In this case, we will set class_weight = "balanced", which will automatically adjust the weights to be inversely proportional to the class frequencies in the input data

  • class_weight is a hyperparameter for the decision tree classifier

In [178]:
#build the decision tree model
decisiontree = DecisionTreeClassifier(random_state=1, class_weight="balanced")
#fit the model to the training set
decisiontree.fit(X_train, y_train)
#create a confusion matrix
confusion_matrix_sklearn(decisiontree, X_train, y_train)
Out[178]:
DecisionTreeClassifier(class_weight='balanced', random_state=1)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
DecisionTreeClassifier(class_weight='balanced', random_state=1)
In [179]:
decision_tree_perf_train = model_performance_classification_sklearn(
    decisiontree, X_train, y_train
)
decision_tree_perf_train
Out[179]:
Accuracy Recall Precision F1
0 0.99 1.00 0.98 0.99

Model is only misclassifying 156 bookings. But there is most likely overfitting in the training data.

In [180]:
#create a confusion matrix for the test set
confusion_matrix_sklearn(decisiontree, X_test, y_test)
In [181]:
decision_tree_perf_test = model_performance_classification_sklearn(
    decisiontree, X_test, y_test
)
decision_tree_perf_test
Out[181]:
Accuracy Recall Precision F1
0 0.86 0.81 0.78 0.79

There is a huge difference in the performance of the model on the training set and the test set, this means the model is overfitting.

Visualizing the Decision Tree¶

Do we need to prune the tree?¶

In [182]:
## creating a list of column names
feature_names = X_train.columns.to_list()
feature_names
Out[182]:
['no_of_adults',
 'no_of_children',
 'required_car_parking_space',
 'lead_time',
 'arrival_year',
 'arrival_month',
 'arrival_date',
 'repeated_guest',
 'no_of_previous_cancellations',
 'no_of_previous_bookings_not_canceled',
 'avg_price_per_room',
 'no_of_special_requests',
 'total_nights',
 'type_of_meal_plan_Meal Plan 2',
 'type_of_meal_plan_Meal Plan 3',
 'type_of_meal_plan_Not Selected',
 'room_type_reserved_Room_Type 2',
 'room_type_reserved_Room_Type 3',
 'room_type_reserved_Room_Type 4',
 'room_type_reserved_Room_Type 5',
 'room_type_reserved_Room_Type 6',
 'room_type_reserved_Room_Type 7',
 'market_segment_type_Complementary',
 'market_segment_type_Corporate',
 'market_segment_type_Offline',
 'market_segment_type_Online']
In [183]:
# Text report showing the rules of a decision tree -

print(tree.export_text(decisiontree, feature_names=feature_names, show_weights=True))
|--- lead_time <= 151.50
|   |--- no_of_special_requests <= 0.50
|   |   |--- market_segment_type_Online <= 0.50
|   |   |   |--- lead_time <= 90.50
|   |   |   |   |--- total_nights <= 5.50
|   |   |   |   |   |--- avg_price_per_room <= 201.50
|   |   |   |   |   |   |--- lead_time <= 74.50
|   |   |   |   |   |   |   |--- arrival_month <= 5.50
|   |   |   |   |   |   |   |   |--- arrival_date <= 27.50
|   |   |   |   |   |   |   |   |   |--- lead_time <= 59.50
|   |   |   |   |   |   |   |   |   |   |--- market_segment_type_Offline <= 0.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 16
|   |   |   |   |   |   |   |   |   |   |--- market_segment_type_Offline >  0.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 11
|   |   |   |   |   |   |   |   |   |--- lead_time >  59.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 16.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [19.38, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  16.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |   |--- arrival_date >  27.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 61.00
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 59.75
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  59.75
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 50.10] class: 1
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  61.00
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 29.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  29.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |--- arrival_month >  5.50
|   |   |   |   |   |   |   |   |--- market_segment_type_Offline <= 0.50
|   |   |   |   |   |   |   |   |   |--- repeated_guest <= 0.50
|   |   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 4 <= 0.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 18
|   |   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 4 >  0.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 8
|   |   |   |   |   |   |   |   |   |--- repeated_guest >  0.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [132.71, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- market_segment_type_Offline >  0.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 50.00
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 9.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  9.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [14.17, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  50.00
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 1.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  1.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 15
|   |   |   |   |   |   |--- lead_time >  74.50
|   |   |   |   |   |   |   |--- lead_time <= 78.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 79.78
|   |   |   |   |   |   |   |   |   |--- arrival_month <= 3.50
|   |   |   |   |   |   |   |   |   |   |--- no_of_adults <= 1.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |   |--- no_of_adults >  1.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.24, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- arrival_month >  3.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [12.67, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  79.78
|   |   |   |   |   |   |   |   |   |--- arrival_month <= 3.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 28.84] class: 1
|   |   |   |   |   |   |   |   |   |--- arrival_month >  3.50
|   |   |   |   |   |   |   |   |   |   |--- total_nights <= 2.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |   |--- total_nights >  2.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |--- lead_time >  78.50
|   |   |   |   |   |   |   |   |--- total_nights <= 3.50
|   |   |   |   |   |   |   |   |   |--- market_segment_type_Corporate <= 0.50
|   |   |   |   |   |   |   |   |   |   |--- total_nights <= 2.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [82.01, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- total_nights >  2.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 6
|   |   |   |   |   |   |   |   |   |--- market_segment_type_Corporate >  0.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 86.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  86.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |--- total_nights >  3.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 24.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 8.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  8.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |--- arrival_date >  24.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 3.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  3.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |--- avg_price_per_room >  201.50
|   |   |   |   |   |   |--- arrival_date <= 28.00
|   |   |   |   |   |   |   |--- weights: [0.00, 25.81] class: 1
|   |   |   |   |   |   |--- arrival_date >  28.00
|   |   |   |   |   |   |   |--- avg_price_per_room <= 240.38
|   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |--- avg_price_per_room >  240.38
|   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |--- total_nights >  5.50
|   |   |   |   |   |--- avg_price_per_room <= 92.80
|   |   |   |   |   |   |--- arrival_date <= 22.50
|   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 5 <= 0.50
|   |   |   |   |   |   |   |   |--- lead_time <= 72.50
|   |   |   |   |   |   |   |   |   |--- lead_time <= 33.00
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 16.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [18.64, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  16.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |--- lead_time >  33.00
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 6.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.24, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  6.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |--- lead_time >  72.50
|   |   |   |   |   |   |   |   |   |--- weights: [14.91, 0.00] class: 0
|   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 5 >  0.50
|   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |--- arrival_date >  22.50
|   |   |   |   |   |   |   |--- weights: [23.86, 0.00] class: 0
|   |   |   |   |   |--- avg_price_per_room >  92.80
|   |   |   |   |   |   |--- arrival_month <= 8.50
|   |   |   |   |   |   |   |--- arrival_date <= 21.00
|   |   |   |   |   |   |   |   |--- no_of_adults <= 1.50
|   |   |   |   |   |   |   |   |   |--- arrival_month <= 4.50
|   |   |   |   |   |   |   |   |   |   |--- total_nights <= 13.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- total_nights >  13.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |   |   |   |   |--- arrival_month >  4.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 74.39] class: 1
|   |   |   |   |   |   |   |   |--- no_of_adults >  1.50
|   |   |   |   |   |   |   |   |   |--- total_nights <= 6.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- total_nights >  6.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 6.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  6.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 4.55] class: 1
|   |   |   |   |   |   |   |--- arrival_date >  21.00
|   |   |   |   |   |   |   |   |--- weights: [5.22, 0.00] class: 0
|   |   |   |   |   |   |--- arrival_month >  8.50
|   |   |   |   |   |   |   |--- weights: [7.46, 0.00] class: 0
|   |   |   |--- lead_time >  90.50
|   |   |   |   |--- lead_time <= 117.50
|   |   |   |   |   |--- avg_price_per_room <= 93.58
|   |   |   |   |   |   |--- avg_price_per_room <= 75.07
|   |   |   |   |   |   |   |--- arrival_month <= 7.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 58.75
|   |   |   |   |   |   |   |   |   |--- weights: [10.44, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  58.75
|   |   |   |   |   |   |   |   |   |--- total_nights <= 3.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 104.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 6
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  104.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 6
|   |   |   |   |   |   |   |   |   |--- total_nights >  3.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 23.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  23.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |--- arrival_month >  7.50
|   |   |   |   |   |   |   |   |--- arrival_date <= 29.50
|   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected <= 0.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 71.12
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  71.12
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected >  0.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 6.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  6.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |--- arrival_date >  29.50
|   |   |   |   |   |   |   |   |   |--- lead_time <= 98.00
|   |   |   |   |   |   |   |   |   |   |--- weights: [4.47, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- lead_time >  98.00
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 63.25
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  63.25
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 13.66] class: 1
|   |   |   |   |   |   |--- avg_price_per_room >  75.07
|   |   |   |   |   |   |   |--- arrival_month <= 3.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 88.50
|   |   |   |   |   |   |   |   |   |--- total_nights <= 1.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 80.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  80.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [17.15, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- total_nights >  1.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [37.28, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  88.50
|   |   |   |   |   |   |   |   |   |--- market_segment_type_Offline <= 0.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |--- market_segment_type_Offline >  0.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |--- arrival_month >  3.50
|   |   |   |   |   |   |   |   |--- arrival_month <= 4.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 80.38
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 16.70] class: 1
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  80.38
|   |   |   |   |   |   |   |   |   |   |--- total_nights <= 4.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- total_nights >  4.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- arrival_month >  4.50
|   |   |   |   |   |   |   |   |   |--- no_of_adults <= 1.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 86.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  86.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |--- no_of_adults >  1.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 22.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  22.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 7
|   |   |   |   |   |--- avg_price_per_room >  93.58
|   |   |   |   |   |   |--- arrival_date <= 11.50
|   |   |   |   |   |   |   |--- arrival_month <= 7.50
|   |   |   |   |   |   |   |   |--- lead_time <= 108.50
|   |   |   |   |   |   |   |   |   |--- no_of_adults <= 1.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [2.98, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- no_of_adults >  1.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 102.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  102.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 4.55] class: 1
|   |   |   |   |   |   |   |   |--- lead_time >  108.50
|   |   |   |   |   |   |   |   |   |--- total_nights <= 2.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [8.95, 1.52] class: 0
|   |   |   |   |   |   |   |   |   |--- total_nights >  2.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [3.73, 0.00] class: 0
|   |   |   |   |   |   |   |--- arrival_month >  7.50
|   |   |   |   |   |   |   |   |--- lead_time <= 110.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 116.75
|   |   |   |   |   |   |   |   |   |   |--- weights: [2.98, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  116.75
|   |   |   |   |   |   |   |   |   |   |--- weights: [1.49, 1.52] class: 1
|   |   |   |   |   |   |   |   |--- lead_time >  110.50
|   |   |   |   |   |   |   |   |   |--- total_nights <= 2.00
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 12.14] class: 1
|   |   |   |   |   |   |   |   |   |--- total_nights >  2.00
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 112.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  112.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |--- arrival_date >  11.50
|   |   |   |   |   |   |   |--- avg_price_per_room <= 102.09
|   |   |   |   |   |   |   |   |--- arrival_date <= 14.50
|   |   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- arrival_date >  14.50
|   |   |   |   |   |   |   |   |   |--- arrival_month <= 2.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- arrival_month >  2.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 95.44
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  95.44
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |--- avg_price_per_room >  102.09
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 109.50
|   |   |   |   |   |   |   |   |   |--- total_nights <= 1.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 108.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 16.70] class: 1
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  108.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- total_nights >  1.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 6.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.24, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  6.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [31.31, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  109.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 124.25
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 19.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 71.35] class: 1
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  19.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  124.25
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 27.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.24, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  27.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |--- lead_time >  117.50
|   |   |   |   |   |--- no_of_adults <= 1.50
|   |   |   |   |   |   |--- avg_price_per_room <= 122.00
|   |   |   |   |   |   |   |--- weights: [105.12, 0.00] class: 0
|   |   |   |   |   |   |--- avg_price_per_room >  122.00
|   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |--- no_of_adults >  1.50
|   |   |   |   |   |   |--- arrival_month <= 11.50
|   |   |   |   |   |   |   |--- arrival_date <= 7.50
|   |   |   |   |   |   |   |   |--- lead_time <= 150.50
|   |   |   |   |   |   |   |   |   |--- arrival_month <= 5.00
|   |   |   |   |   |   |   |   |   |   |--- weights: [24.60, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- arrival_month >  5.00
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 6.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  6.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [14.17, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- lead_time >  150.50
|   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |--- arrival_date >  7.50
|   |   |   |   |   |   |   |   |--- arrival_date <= 24.50
|   |   |   |   |   |   |   |   |   |--- total_nights <= 3.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 23.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 9
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  23.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |--- total_nights >  3.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 74.12
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 7
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  74.12
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [20.13, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- arrival_date >  24.50
|   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 5 <= 0.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 57.25
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  57.25
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 5 >  0.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 4.55] class: 1
|   |   |   |   |   |   |--- arrival_month >  11.50
|   |   |   |   |   |   |   |--- weights: [48.46, 0.00] class: 0
|   |   |--- market_segment_type_Online >  0.50
|   |   |   |--- lead_time <= 13.50
|   |   |   |   |--- avg_price_per_room <= 99.44
|   |   |   |   |   |--- arrival_month <= 1.50
|   |   |   |   |   |   |--- weights: [92.45, 0.00] class: 0
|   |   |   |   |   |--- arrival_month >  1.50
|   |   |   |   |   |   |--- arrival_month <= 8.50
|   |   |   |   |   |   |   |--- total_nights <= 2.50
|   |   |   |   |   |   |   |   |--- lead_time <= 5.50
|   |   |   |   |   |   |   |   |   |--- no_of_adults <= 1.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 2.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  2.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [28.33, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- no_of_adults >  1.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 74.40
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [17.89, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  74.40
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 14
|   |   |   |   |   |   |   |   |--- lead_time >  5.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 3.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [7.46, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- arrival_date >  3.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 68.38
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [4.47, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  68.38
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 12
|   |   |   |   |   |   |   |--- total_nights >  2.50
|   |   |   |   |   |   |   |   |--- no_of_adults <= 1.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 85.50
|   |   |   |   |   |   |   |   |   |   |--- no_of_adults <= 0.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- no_of_adults >  0.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 22.77] class: 1
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  85.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 89.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  89.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.24, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- no_of_adults >  1.50
|   |   |   |   |   |   |   |   |   |--- lead_time <= 2.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 20.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  20.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |--- lead_time >  2.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 13.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  13.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |--- arrival_month >  8.50
|   |   |   |   |   |   |   |--- total_nights <= 5.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 94.66
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 1.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 11.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  11.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |--- arrival_date >  1.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 90.17
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [116.31, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  90.17
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  94.66
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 95.10
|   |   |   |   |   |   |   |   |   |   |--- no_of_adults <= 1.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.98, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- no_of_adults >  1.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  95.10
|   |   |   |   |   |   |   |   |   |   |--- weights: [14.91, 0.00] class: 0
|   |   |   |   |   |   |   |--- total_nights >  5.50
|   |   |   |   |   |   |   |   |--- arrival_month <= 11.50
|   |   |   |   |   |   |   |   |   |--- lead_time <= 3.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- lead_time >  3.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 9.11] class: 1
|   |   |   |   |   |   |   |   |--- arrival_month >  11.50
|   |   |   |   |   |   |   |   |   |--- no_of_adults <= 1.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [2.98, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- no_of_adults >  1.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [2.98, 0.00] class: 0
|   |   |   |   |--- avg_price_per_room >  99.44
|   |   |   |   |   |--- lead_time <= 3.50
|   |   |   |   |   |   |--- avg_price_per_room <= 202.67
|   |   |   |   |   |   |   |--- total_nights <= 6.50
|   |   |   |   |   |   |   |   |--- arrival_month <= 5.50
|   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected <= 0.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 163.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 9
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  163.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [8.95, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected >  0.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [10.44, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- arrival_month >  5.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 20.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 132.39
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [60.39, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  132.39
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 10
|   |   |   |   |   |   |   |   |   |--- arrival_date >  20.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 24.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  24.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 6
|   |   |   |   |   |   |   |--- total_nights >  6.50
|   |   |   |   |   |   |   |   |--- weights: [0.00, 6.07] class: 1
|   |   |   |   |   |   |--- avg_price_per_room >  202.67
|   |   |   |   |   |   |   |--- arrival_month <= 11.00
|   |   |   |   |   |   |   |   |--- weights: [0.00, 22.77] class: 1
|   |   |   |   |   |   |   |--- arrival_month >  11.00
|   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |--- lead_time >  3.50
|   |   |   |   |   |   |--- arrival_month <= 8.50
|   |   |   |   |   |   |   |--- avg_price_per_room <= 119.25
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 118.50
|   |   |   |   |   |   |   |   |   |--- lead_time <= 12.50
|   |   |   |   |   |   |   |   |   |   |--- no_of_adults <= 1.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |   |   |   |--- no_of_adults >  1.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 11
|   |   |   |   |   |   |   |   |   |--- lead_time >  12.50
|   |   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected <= 0.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.98, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected >  0.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  118.50
|   |   |   |   |   |   |   |   |   |--- arrival_month <= 3.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 4.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  4.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- arrival_month >  3.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [7.46, 0.00] class: 0
|   |   |   |   |   |   |   |--- avg_price_per_room >  119.25
|   |   |   |   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |   |   |   |--- total_nights <= 1.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |--- total_nights >  1.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [2.98, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |   |   |   |   |--- no_of_adults <= 2.50
|   |   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 5 <= 0.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 18
|   |   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 5 >  0.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- no_of_adults >  2.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 5.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  5.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 6
|   |   |   |   |   |   |--- arrival_month >  8.50
|   |   |   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |   |   |--- total_nights <= 1.50
|   |   |   |   |   |   |   |   |   |--- lead_time <= 9.00
|   |   |   |   |   |   |   |   |   |   |--- weights: [3.73, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- lead_time >  9.00
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 10.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  10.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- total_nights >  1.50
|   |   |   |   |   |   |   |   |   |--- weights: [21.62, 0.00] class: 0
|   |   |   |   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |   |   |   |--- arrival_month <= 11.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 14.00
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 1.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.24, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  1.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 9
|   |   |   |   |   |   |   |   |   |--- arrival_date >  14.00
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 208.67
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  208.67
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 4.55] class: 1
|   |   |   |   |   |   |   |   |--- arrival_month >  11.50
|   |   |   |   |   |   |   |   |   |--- weights: [15.66, 0.00] class: 0
|   |   |   |--- lead_time >  13.50
|   |   |   |   |--- required_car_parking_space <= 0.50
|   |   |   |   |   |--- avg_price_per_room <= 71.92
|   |   |   |   |   |   |--- avg_price_per_room <= 59.43
|   |   |   |   |   |   |   |--- lead_time <= 84.50
|   |   |   |   |   |   |   |   |--- arrival_date <= 17.50
|   |   |   |   |   |   |   |   |   |--- lead_time <= 51.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 21.67
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [6.71, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  21.67
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |   |   |--- lead_time >  51.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [12.67, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- arrival_date >  17.50
|   |   |   |   |   |   |   |   |   |--- weights: [23.11, 0.00] class: 0
|   |   |   |   |   |   |   |--- lead_time >  84.50
|   |   |   |   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 27.00
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 131.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  131.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |--- arrival_date >  27.00
|   |   |   |   |   |   |   |   |   |   |--- total_nights <= 2.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- total_nights >  2.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.98, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |   |   |   |   |--- weights: [10.44, 0.00] class: 0
|   |   |   |   |   |   |--- avg_price_per_room >  59.43
|   |   |   |   |   |   |   |--- lead_time <= 25.50
|   |   |   |   |   |   |   |   |--- total_nights <= 4.50
|   |   |   |   |   |   |   |   |   |--- total_nights <= 1.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 69.06
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  69.06
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- total_nights >  1.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [14.91, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- total_nights >  4.50
|   |   |   |   |   |   |   |   |   |--- arrival_month <= 11.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 4.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  4.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |--- arrival_month >  11.50
|   |   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected <= 0.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.24, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected >  0.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |--- lead_time >  25.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 71.34
|   |   |   |   |   |   |   |   |   |--- arrival_month <= 3.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 68.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 9
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  68.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 6
|   |   |   |   |   |   |   |   |   |--- arrival_month >  3.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 102.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 7
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  102.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  71.34
|   |   |   |   |   |   |   |   |   |--- weights: [11.18, 0.00] class: 0
|   |   |   |   |   |--- avg_price_per_room >  71.92
|   |   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |   |--- lead_time <= 65.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 120.45
|   |   |   |   |   |   |   |   |   |--- no_of_adults <= 2.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 7.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  7.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 7
|   |   |   |   |   |   |   |   |   |--- no_of_adults >  2.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  120.45
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 17.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [2.98, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- arrival_date >  17.50
|   |   |   |   |   |   |   |   |   |   |--- total_nights <= 2.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.98, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- total_nights >  2.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |--- lead_time >  65.50
|   |   |   |   |   |   |   |   |--- type_of_meal_plan_Meal Plan 2 <= 0.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 27.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 75.75
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 10.63] class: 1
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  75.75
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 12
|   |   |   |   |   |   |   |   |   |--- arrival_date >  27.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [3.73, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- type_of_meal_plan_Meal Plan 2 >  0.50
|   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 5 <= 0.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 60.72] class: 1
|   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 5 >  0.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |   |   |--- avg_price_per_room <= 104.31
|   |   |   |   |   |   |   |   |--- lead_time <= 25.50
|   |   |   |   |   |   |   |   |   |--- arrival_month <= 11.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 1.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [16.40, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  1.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 15
|   |   |   |   |   |   |   |   |   |--- arrival_month >  11.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [23.11, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- lead_time >  25.50
|   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected <= 0.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 5.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 20
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  5.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 15
|   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected >  0.50
|   |   |   |   |   |   |   |   |   |   |--- no_of_adults <= 1.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |   |   |   |--- no_of_adults >  1.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 17
|   |   |   |   |   |   |   |--- avg_price_per_room >  104.31
|   |   |   |   |   |   |   |   |--- arrival_month <= 10.50
|   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 5 <= 0.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 195.30
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 25
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  195.30
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 5 >  0.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 22.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  22.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |--- arrival_month >  10.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 168.06
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 22.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  22.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 10
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  168.06
|   |   |   |   |   |   |   |   |   |   |--- no_of_children <= 0.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |   |--- no_of_children >  0.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |--- required_car_parking_space >  0.50
|   |   |   |   |   |--- total_nights <= 11.00
|   |   |   |   |   |   |--- weights: [48.46, 0.00] class: 0
|   |   |   |   |   |--- total_nights >  11.00
|   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |--- no_of_special_requests >  0.50
|   |   |--- no_of_special_requests <= 1.50
|   |   |   |--- market_segment_type_Online <= 0.50
|   |   |   |   |--- lead_time <= 102.50
|   |   |   |   |   |--- type_of_meal_plan_Not Selected <= 0.50
|   |   |   |   |   |   |--- total_nights <= 15.00
|   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 5 <= 0.50
|   |   |   |   |   |   |   |   |--- lead_time <= 91.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 129.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [632.23, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  129.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 131.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  131.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [20.13, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- lead_time >  91.50
|   |   |   |   |   |   |   |   |   |--- no_of_children <= 0.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 3.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [5.22, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  3.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [26.84, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- no_of_children >  0.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 16.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  16.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 5 >  0.50
|   |   |   |   |   |   |   |   |--- total_nights <= 4.50
|   |   |   |   |   |   |   |   |   |--- weights: [8.95, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- total_nights >  4.50
|   |   |   |   |   |   |   |   |   |--- market_segment_type_Offline <= 0.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |   |   |   |   |--- market_segment_type_Offline >  0.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |--- total_nights >  15.00
|   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |--- type_of_meal_plan_Not Selected >  0.50
|   |   |   |   |   |   |--- lead_time <= 63.00
|   |   |   |   |   |   |   |--- market_segment_type_Corporate <= 0.50
|   |   |   |   |   |   |   |   |--- weights: [13.42, 0.00] class: 0
|   |   |   |   |   |   |   |--- market_segment_type_Corporate >  0.50
|   |   |   |   |   |   |   |   |--- arrival_date <= 14.50
|   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- arrival_date >  14.50
|   |   |   |   |   |   |   |   |   |--- weights: [1.49, 1.52] class: 1
|   |   |   |   |   |   |--- lead_time >  63.00
|   |   |   |   |   |   |   |--- weights: [0.00, 7.59] class: 1
|   |   |   |   |--- lead_time >  102.50
|   |   |   |   |   |--- lead_time <= 104.50
|   |   |   |   |   |   |--- lead_time <= 103.50
|   |   |   |   |   |   |   |--- no_of_children <= 0.50
|   |   |   |   |   |   |   |   |--- weights: [3.73, 0.00] class: 0
|   |   |   |   |   |   |   |--- no_of_children >  0.50
|   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |--- lead_time >  103.50
|   |   |   |   |   |   |   |--- weights: [0.00, 4.55] class: 1
|   |   |   |   |   |--- lead_time >  104.50
|   |   |   |   |   |   |--- lead_time <= 150.50
|   |   |   |   |   |   |   |--- avg_price_per_room <= 141.75
|   |   |   |   |   |   |   |   |--- total_nights <= 3.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 81.00
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 3.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [5.22, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  3.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 6
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  81.00
|   |   |   |   |   |   |   |   |   |   |--- no_of_adults <= 2.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |   |--- no_of_adults >  2.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |--- total_nights >  3.50
|   |   |   |   |   |   |   |   |   |--- weights: [20.13, 0.00] class: 0
|   |   |   |   |   |   |   |--- avg_price_per_room >  141.75
|   |   |   |   |   |   |   |   |--- total_nights <= 5.00
|   |   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |   |   |   |--- total_nights >  5.00
|   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |--- lead_time >  150.50
|   |   |   |   |   |   |   |--- total_nights <= 2.50
|   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |   |   |--- total_nights >  2.50
|   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |--- market_segment_type_Online >  0.50
|   |   |   |   |--- lead_time <= 8.50
|   |   |   |   |   |--- lead_time <= 4.50
|   |   |   |   |   |   |--- total_nights <= 14.00
|   |   |   |   |   |   |   |--- avg_price_per_room <= 219.86
|   |   |   |   |   |   |   |   |--- total_nights <= 6.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 157.64
|   |   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 2 <= 0.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 13
|   |   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 2 >  0.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  157.64
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 158.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  158.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 8
|   |   |   |   |   |   |   |   |--- total_nights >  6.50
|   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 4 <= 0.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 4.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  4.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [5.96, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 4 >  0.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 6.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  6.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |   |   |--- avg_price_per_room >  219.86
|   |   |   |   |   |   |   |   |--- arrival_date <= 11.50
|   |   |   |   |   |   |   |   |   |--- weights: [3.73, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- arrival_date >  11.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 14.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |   |   |   |   |--- arrival_date >  14.50
|   |   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 4 <= 0.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.98, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 4 >  0.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |--- total_nights >  14.00
|   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |--- lead_time >  4.50
|   |   |   |   |   |   |--- arrival_date <= 13.50
|   |   |   |   |   |   |   |--- arrival_month <= 9.50
|   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 2 <= 0.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 88.39
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 3.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  3.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [11.93, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  88.39
|   |   |   |   |   |   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 8
|   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 2 >  0.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 94.48
|   |   |   |   |   |   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  94.48
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |--- arrival_month >  9.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 157.12
|   |   |   |   |   |   |   |   |   |--- weights: [32.06, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  157.12
|   |   |   |   |   |   |   |   |   |--- total_nights <= 3.00
|   |   |   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- total_nights >  3.00
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |--- arrival_date >  13.50
|   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected <= 0.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 139.57
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 101.59
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 101.22
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  101.22
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  101.59
|   |   |   |   |   |   |   |   |   |   |--- weights: [57.41, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  139.57
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 15.50
|   |   |   |   |   |   |   |   |   |   |--- total_nights <= 1.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- total_nights >  1.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |--- arrival_date >  15.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 140.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  140.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected >  0.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 126.33
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 21.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [17.89, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- arrival_date >  21.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 23.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  23.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [12.67, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  126.33
|   |   |   |   |   |   |   |   |   |--- total_nights <= 1.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 128.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  128.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |--- total_nights >  1.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 9.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [6.71, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  9.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |--- lead_time >  8.50
|   |   |   |   |   |--- required_car_parking_space <= 0.50
|   |   |   |   |   |   |--- avg_price_per_room <= 118.55
|   |   |   |   |   |   |   |--- lead_time <= 61.50
|   |   |   |   |   |   |   |   |--- arrival_month <= 11.50
|   |   |   |   |   |   |   |   |   |--- total_nights <= 6.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 1.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [65.61, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  1.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 20
|   |   |   |   |   |   |   |   |   |--- total_nights >  6.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 1.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [4.47, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  1.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 12
|   |   |   |   |   |   |   |   |--- arrival_month >  11.50
|   |   |   |   |   |   |   |   |   |--- total_nights <= 12.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [126.74, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- total_nights >  12.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |--- lead_time >  61.50
|   |   |   |   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |   |   |   |--- arrival_month <= 7.50
|   |   |   |   |   |   |   |   |   |   |--- no_of_children <= 0.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 6
|   |   |   |   |   |   |   |   |   |   |--- no_of_children >  0.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |--- arrival_month >  7.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 66.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [5.22, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  66.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 9
|   |   |   |   |   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |   |   |   |   |--- arrival_month <= 9.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 71.93
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  71.93
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 16
|   |   |   |   |   |   |   |   |   |--- arrival_month >  9.50
|   |   |   |   |   |   |   |   |   |   |--- total_nights <= 2.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 12
|   |   |   |   |   |   |   |   |   |   |--- total_nights >  2.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 18
|   |   |   |   |   |   |--- avg_price_per_room >  118.55
|   |   |   |   |   |   |   |--- arrival_month <= 8.50
|   |   |   |   |   |   |   |   |--- arrival_date <= 19.50
|   |   |   |   |   |   |   |   |   |--- total_nights <= 9.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 177.15
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 17
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  177.15
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 7
|   |   |   |   |   |   |   |   |   |--- total_nights >  9.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 6.07] class: 1
|   |   |   |   |   |   |   |   |--- arrival_date >  19.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 27.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 121.20
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 8
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  121.20
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 12
|   |   |   |   |   |   |   |   |   |--- arrival_date >  27.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 55.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 10
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  55.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 8
|   |   |   |   |   |   |   |--- arrival_month >  8.50
|   |   |   |   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |   |   |   |--- arrival_month <= 9.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 14.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  14.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |--- arrival_month >  9.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [37.28, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |   |   |   |   |--- arrival_month <= 11.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 119.20
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 7
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  119.20
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 25
|   |   |   |   |   |   |   |   |   |--- arrival_month >  11.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 100.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [49.95, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  100.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |--- required_car_parking_space >  0.50
|   |   |   |   |   |   |--- total_nights <= 10.50
|   |   |   |   |   |   |   |--- weights: [134.20, 0.00] class: 0
|   |   |   |   |   |   |--- total_nights >  10.50
|   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |--- no_of_special_requests >  1.50
|   |   |   |--- lead_time <= 90.50
|   |   |   |   |--- total_nights <= 4.50
|   |   |   |   |   |--- total_nights <= 3.50
|   |   |   |   |   |   |--- weights: [1259.24, 0.00] class: 0
|   |   |   |   |   |--- total_nights >  3.50
|   |   |   |   |   |   |--- room_type_reserved_Room_Type 6 <= 0.50
|   |   |   |   |   |   |   |--- avg_price_per_room <= 90.05
|   |   |   |   |   |   |   |   |--- lead_time <= 48.00
|   |   |   |   |   |   |   |   |   |--- arrival_month <= 2.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 20.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  20.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.98, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- arrival_month >  2.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [45.48, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- lead_time >  48.00
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 89.85
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 14.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [13.42, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  14.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  89.85
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |--- avg_price_per_room >  90.05
|   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected <= 0.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 29.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [211.74, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- arrival_date >  29.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 54.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [12.67, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  54.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected >  0.50
|   |   |   |   |   |   |   |   |   |--- lead_time <= 28.50
|   |   |   |   |   |   |   |   |   |   |--- repeated_guest <= 0.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [10.44, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- repeated_guest >  0.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- lead_time >  28.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 30.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  30.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |--- room_type_reserved_Room_Type 6 >  0.50
|   |   |   |   |   |   |   |--- lead_time <= 31.00
|   |   |   |   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |   |   |   |--- weights: [2.24, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |   |   |   |   |--- weights: [7.46, 0.00] class: 0
|   |   |   |   |   |   |   |--- lead_time >  31.00
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 159.42
|   |   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  159.42
|   |   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |--- total_nights >  4.50
|   |   |   |   |   |--- total_nights <= 12.00
|   |   |   |   |   |   |--- no_of_special_requests <= 2.50
|   |   |   |   |   |   |   |--- total_nights <= 6.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 144.28
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 134.74
|   |   |   |   |   |   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 9
|   |   |   |   |   |   |   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  134.74
|   |   |   |   |   |   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  144.28
|   |   |   |   |   |   |   |   |   |--- weights: [35.79, 0.00] class: 0
|   |   |   |   |   |   |   |--- total_nights >  6.50
|   |   |   |   |   |   |   |   |--- lead_time <= 9.00
|   |   |   |   |   |   |   |   |   |--- weights: [9.69, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- lead_time >  9.00
|   |   |   |   |   |   |   |   |   |--- lead_time <= 34.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 9.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [4.47, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  9.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 6
|   |   |   |   |   |   |   |   |   |--- lead_time >  34.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 72.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  72.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |--- no_of_special_requests >  2.50
|   |   |   |   |   |   |   |--- weights: [51.44, 0.00] class: 0
|   |   |   |   |   |--- total_nights >  12.00
|   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |--- lead_time >  90.50
|   |   |   |   |--- no_of_special_requests <= 2.50
|   |   |   |   |   |--- arrival_month <= 8.50
|   |   |   |   |   |   |--- avg_price_per_room <= 202.95
|   |   |   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |   |   |--- arrival_month <= 7.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 4.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- arrival_date >  4.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 26.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 7.59] class: 1
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  26.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |--- arrival_month >  7.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 24.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 98.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  98.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |--- arrival_date >  24.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |   |   |   |--- lead_time <= 150.50
|   |   |   |   |   |   |   |   |   |--- total_nights <= 5.50
|   |   |   |   |   |   |   |   |   |   |--- total_nights <= 3.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 8
|   |   |   |   |   |   |   |   |   |   |--- total_nights >  3.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |--- total_nights >  5.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 5.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  5.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |   |--- lead_time >  150.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 131.97
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  131.97
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |   |--- avg_price_per_room >  202.95
|   |   |   |   |   |   |   |--- no_of_children <= 1.00
|   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |   |   |--- no_of_children >  1.00
|   |   |   |   |   |   |   |   |--- weights: [0.00, 7.59] class: 1
|   |   |   |   |   |--- arrival_month >  8.50
|   |   |   |   |   |   |--- avg_price_per_room <= 153.15
|   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 2 <= 0.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 71.12
|   |   |   |   |   |   |   |   |   |--- weights: [3.73, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  71.12
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 90.42
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 11.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 6
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  11.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 7
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  90.42
|   |   |   |   |   |   |   |   |   |   |--- no_of_adults <= 1.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [5.96, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- no_of_adults >  1.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 16
|   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 2 >  0.50
|   |   |   |   |   |   |   |   |--- weights: [5.96, 0.00] class: 0
|   |   |   |   |   |   |--- avg_price_per_room >  153.15
|   |   |   |   |   |   |   |--- arrival_date <= 22.50
|   |   |   |   |   |   |   |   |--- weights: [8.20, 0.00] class: 0
|   |   |   |   |   |   |   |--- arrival_date >  22.50
|   |   |   |   |   |   |   |   |--- lead_time <= 106.50
|   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |--- lead_time >  106.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 23.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |--- arrival_date >  23.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [4.47, 0.00] class: 0
|   |   |   |   |--- no_of_special_requests >  2.50
|   |   |   |   |   |--- weights: [67.10, 0.00] class: 0
|--- lead_time >  151.50
|   |--- avg_price_per_room <= 100.04
|   |   |--- no_of_special_requests <= 0.50
|   |   |   |--- no_of_adults <= 1.50
|   |   |   |   |--- market_segment_type_Online <= 0.50
|   |   |   |   |   |--- lead_time <= 163.50
|   |   |   |   |   |   |--- arrival_month <= 5.00
|   |   |   |   |   |   |   |--- weights: [2.98, 0.00] class: 0
|   |   |   |   |   |   |--- arrival_month >  5.00
|   |   |   |   |   |   |   |--- avg_price_per_room <= 80.00
|   |   |   |   |   |   |   |   |--- weights: [0.75, 1.52] class: 1
|   |   |   |   |   |   |   |--- avg_price_per_room >  80.00
|   |   |   |   |   |   |   |   |--- weights: [0.00, 22.77] class: 1
|   |   |   |   |   |--- lead_time >  163.50
|   |   |   |   |   |   |--- lead_time <= 341.00
|   |   |   |   |   |   |   |--- lead_time <= 173.00
|   |   |   |   |   |   |   |   |--- arrival_date <= 3.50
|   |   |   |   |   |   |   |   |   |--- total_nights <= 1.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- total_nights >  1.50
|   |   |   |   |   |   |   |   |   |   |--- total_nights <= 3.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [45.48, 9.11] class: 0
|   |   |   |   |   |   |   |   |   |   |--- total_nights >  3.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- arrival_date >  3.50
|   |   |   |   |   |   |   |   |   |--- total_nights <= 3.00
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 13.66] class: 1
|   |   |   |   |   |   |   |   |   |--- total_nights >  3.00
|   |   |   |   |   |   |   |   |   |   |--- weights: [2.24, 0.00] class: 0
|   |   |   |   |   |   |   |--- lead_time >  173.00
|   |   |   |   |   |   |   |   |--- arrival_month <= 5.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 7.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 4.55] class: 1
|   |   |   |   |   |   |   |   |   |--- arrival_date >  7.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [6.71, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- arrival_month >  5.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 98.00
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 55.21
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  55.21
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 6
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  98.00
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 231.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  231.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |--- lead_time >  341.00
|   |   |   |   |   |   |   |--- arrival_date <= 8.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 88.33
|   |   |   |   |   |   |   |   |   |--- weights: [0.00, 10.63] class: 1
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  88.33
|   |   |   |   |   |   |   |   |   |--- weights: [0.75, 1.52] class: 1
|   |   |   |   |   |   |   |--- arrival_date >  8.50
|   |   |   |   |   |   |   |   |--- arrival_date <= 24.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 80.00
|   |   |   |   |   |   |   |   |   |   |--- weights: [3.73, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  80.00
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 381.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  381.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.24, 3.04] class: 1
|   |   |   |   |   |   |   |   |--- arrival_date >  24.50
|   |   |   |   |   |   |   |   |   |--- weights: [0.00, 4.55] class: 1
|   |   |   |   |--- market_segment_type_Online >  0.50
|   |   |   |   |   |--- avg_price_per_room <= 2.50
|   |   |   |   |   |   |--- lead_time <= 285.50
|   |   |   |   |   |   |   |--- type_of_meal_plan_Meal Plan 2 <= 0.50
|   |   |   |   |   |   |   |   |--- weights: [7.46, 0.00] class: 0
|   |   |   |   |   |   |   |--- type_of_meal_plan_Meal Plan 2 >  0.50
|   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |--- lead_time >  285.50
|   |   |   |   |   |   |   |--- arrival_month <= 9.50
|   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |--- arrival_month >  9.50
|   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |--- avg_price_per_room >  2.50
|   |   |   |   |   |   |--- arrival_date <= 29.50
|   |   |   |   |   |   |   |--- weights: [0.00, 88.05] class: 1
|   |   |   |   |   |   |--- arrival_date >  29.50
|   |   |   |   |   |   |   |--- arrival_month <= 11.50
|   |   |   |   |   |   |   |   |--- weights: [0.00, 9.11] class: 1
|   |   |   |   |   |   |   |--- arrival_month >  11.50
|   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |--- no_of_adults >  1.50
|   |   |   |   |--- avg_price_per_room <= 82.47
|   |   |   |   |   |--- market_segment_type_Offline <= 0.50
|   |   |   |   |   |   |--- arrival_month <= 11.50
|   |   |   |   |   |   |   |--- weights: [0.00, 197.36] class: 1
|   |   |   |   |   |   |--- arrival_month >  11.50
|   |   |   |   |   |   |   |--- total_nights <= 1.50
|   |   |   |   |   |   |   |   |--- arrival_date <= 14.50
|   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |--- arrival_date >  14.50
|   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |--- total_nights >  1.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 80.51
|   |   |   |   |   |   |   |   |   |--- total_nights <= 3.50
|   |   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected <= 0.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected >  0.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 19.74] class: 1
|   |   |   |   |   |   |   |   |   |--- total_nights >  3.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 57.69] class: 1
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  80.51
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 81.43
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  81.43
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |--- market_segment_type_Offline >  0.50
|   |   |   |   |   |   |--- arrival_month <= 11.50
|   |   |   |   |   |   |   |--- lead_time <= 244.00
|   |   |   |   |   |   |   |   |--- total_nights <= 2.50
|   |   |   |   |   |   |   |   |   |--- lead_time <= 166.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [2.24, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- lead_time >  166.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 19.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  19.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |--- total_nights >  2.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 11.50
|   |   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Meal Plan 2 <= 0.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 6
|   |   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Meal Plan 2 >  0.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |--- arrival_date >  11.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 15.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  15.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 7
|   |   |   |   |   |   |   |--- lead_time >  244.00
|   |   |   |   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |   |   |   |--- weights: [25.35, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 80.38
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 76.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 9
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  76.00
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  80.38
|   |   |   |   |   |   |   |   |   |   |--- weights: [7.46, 0.00] class: 0
|   |   |   |   |   |   |--- arrival_month >  11.50
|   |   |   |   |   |   |   |--- weights: [46.22, 0.00] class: 0
|   |   |   |   |--- avg_price_per_room >  82.47
|   |   |   |   |   |--- no_of_adults <= 2.50
|   |   |   |   |   |   |--- lead_time <= 324.50
|   |   |   |   |   |   |   |--- arrival_month <= 11.50
|   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 4 <= 0.50
|   |   |   |   |   |   |   |   |   |--- market_segment_type_Corporate <= 0.50
|   |   |   |   |   |   |   |   |   |   |--- market_segment_type_Online <= 0.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 7
|   |   |   |   |   |   |   |   |   |   |--- market_segment_type_Online >  0.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 499.46] class: 1
|   |   |   |   |   |   |   |   |   |--- market_segment_type_Corporate >  0.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- room_type_reserved_Room_Type 4 >  0.50
|   |   |   |   |   |   |   |   |   |--- market_segment_type_Online <= 0.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [4.47, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- market_segment_type_Online >  0.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 10.63] class: 1
|   |   |   |   |   |   |   |--- arrival_month >  11.50
|   |   |   |   |   |   |   |   |--- market_segment_type_Offline <= 0.50
|   |   |   |   |   |   |   |   |   |--- weights: [0.00, 19.74] class: 1
|   |   |   |   |   |   |   |   |--- market_segment_type_Offline >  0.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 15.00
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- arrival_date >  15.00
|   |   |   |   |   |   |   |   |   |   |--- weights: [4.47, 0.00] class: 0
|   |   |   |   |   |   |--- lead_time >  324.50
|   |   |   |   |   |   |   |--- avg_price_per_room <= 89.00
|   |   |   |   |   |   |   |   |--- weights: [5.96, 0.00] class: 0
|   |   |   |   |   |   |   |--- avg_price_per_room >  89.00
|   |   |   |   |   |   |   |   |--- market_segment_type_Offline <= 0.50
|   |   |   |   |   |   |   |   |   |--- weights: [0.00, 6.07] class: 1
|   |   |   |   |   |   |   |   |--- market_segment_type_Offline >  0.50
|   |   |   |   |   |   |   |   |   |--- weights: [0.75, 7.59] class: 1
|   |   |   |   |   |--- no_of_adults >  2.50
|   |   |   |   |   |   |--- weights: [5.22, 0.00] class: 0
|   |   |--- no_of_special_requests >  0.50
|   |   |   |--- market_segment_type_Offline <= 0.50
|   |   |   |   |--- lead_time <= 180.50
|   |   |   |   |   |--- lead_time <= 159.50
|   |   |   |   |   |   |--- arrival_month <= 8.50
|   |   |   |   |   |   |   |--- lead_time <= 152.50
|   |   |   |   |   |   |   |   |--- arrival_date <= 23.00
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 4.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |   |   |   |   |--- arrival_date >  4.50
|   |   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected <= 0.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- type_of_meal_plan_Not Selected >  0.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |--- arrival_date >  23.00
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 87.39
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  87.39
|   |   |   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |--- lead_time >  152.50
|   |   |   |   |   |   |   |   |--- lead_time <= 156.50
|   |   |   |   |   |   |   |   |   |--- weights: [8.95, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- lead_time >  156.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 29.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 10.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  10.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |--- arrival_date >  29.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |--- arrival_month >  8.50
|   |   |   |   |   |   |   |--- arrival_date <= 23.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 87.12
|   |   |   |   |   |   |   |   |   |--- no_of_special_requests <= 1.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |--- no_of_special_requests >  1.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  87.12
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 89.38
|   |   |   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  89.38
|   |   |   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |--- arrival_date >  23.50
|   |   |   |   |   |   |   |   |--- no_of_adults <= 0.50
|   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |--- no_of_adults >  0.50
|   |   |   |   |   |   |   |   |   |--- weights: [0.00, 10.63] class: 1
|   |   |   |   |   |--- lead_time >  159.50
|   |   |   |   |   |   |--- no_of_adults <= 0.50
|   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |--- no_of_adults >  0.50
|   |   |   |   |   |   |   |--- avg_price_per_room <= 93.44
|   |   |   |   |   |   |   |   |--- arrival_date <= 28.50
|   |   |   |   |   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_date <= 25.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.24, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_date >  25.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |   |   |   |   |   |--- total_nights <= 1.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- total_nights >  1.50
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [48.46, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- arrival_date >  28.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 30.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |   |   |   |   |--- arrival_date >  30.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |--- avg_price_per_room >  93.44
|   |   |   |   |   |   |   |   |--- lead_time <= 178.50
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 18.50
|   |   |   |   |   |   |   |   |   |   |--- lead_time <= 170.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 5
|   |   |   |   |   |   |   |   |   |   |--- lead_time >  170.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 4
|   |   |   |   |   |   |   |   |   |--- arrival_date >  18.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [13.42, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- lead_time >  178.50
|   |   |   |   |   |   |   |   |   |--- lead_time <= 179.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 4.55] class: 1
|   |   |   |   |   |   |   |   |   |--- lead_time >  179.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 97.82
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  97.82
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [2.98, 1.52] class: 0
|   |   |   |   |--- lead_time >  180.50
|   |   |   |   |   |--- total_nights <= 3.50
|   |   |   |   |   |   |--- no_of_special_requests <= 2.50
|   |   |   |   |   |   |   |--- avg_price_per_room <= 66.47
|   |   |   |   |   |   |   |   |--- weights: [2.24, 0.00] class: 0
|   |   |   |   |   |   |   |--- avg_price_per_room >  66.47
|   |   |   |   |   |   |   |   |--- lead_time <= 187.50
|   |   |   |   |   |   |   |   |   |--- arrival_month <= 4.00
|   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- arrival_month >  4.00
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 78.30
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  78.30
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |--- lead_time >  187.50
|   |   |   |   |   |   |   |   |   |--- lead_time <= 304.50
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 99.30
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 14
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  99.30
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- lead_time >  304.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 9.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  9.00
|   |   |   |   |   |   |   |   |   |   |   |--- weights: [0.00, 25.81] class: 1
|   |   |   |   |   |   |--- no_of_special_requests >  2.50
|   |   |   |   |   |   |   |--- weights: [8.20, 0.00] class: 0
|   |   |   |   |   |--- total_nights >  3.50
|   |   |   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |   |   |--- weights: [14.17, 0.00] class: 0
|   |   |   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |   |   |--- total_nights <= 11.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 69.40
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 64.43
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room <= 55.92
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  55.92
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 2
|   |   |   |   |   |   |   |   |   |--- avg_price_per_room >  64.43
|   |   |   |   |   |   |   |   |   |   |--- weights: [8.20, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  69.40
|   |   |   |   |   |   |   |   |   |--- no_of_special_requests <= 2.50
|   |   |   |   |   |   |   |   |   |   |--- arrival_month <= 10.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 13
|   |   |   |   |   |   |   |   |   |   |--- arrival_month >  10.50
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 14
|   |   |   |   |   |   |   |   |   |--- no_of_special_requests >  2.50
|   |   |   |   |   |   |   |   |   |   |--- weights: [5.22, 0.00] class: 0
|   |   |   |   |   |   |   |--- total_nights >  11.50
|   |   |   |   |   |   |   |   |--- lead_time <= 198.00
|   |   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- lead_time >  198.00
|   |   |   |   |   |   |   |   |   |--- weights: [0.00, 10.63] class: 1
|   |   |   |--- market_segment_type_Offline >  0.50
|   |   |   |   |--- lead_time <= 348.50
|   |   |   |   |   |--- no_of_adults <= 2.50
|   |   |   |   |   |   |--- total_nights <= 7.50
|   |   |   |   |   |   |   |--- arrival_date <= 30.50
|   |   |   |   |   |   |   |   |--- lead_time <= 331.00
|   |   |   |   |   |   |   |   |   |--- weights: [108.85, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- lead_time >  331.00
|   |   |   |   |   |   |   |   |   |--- arrival_date <= 10.00
|   |   |   |   |   |   |   |   |   |   |--- weights: [5.96, 0.00] class: 0
|   |   |   |   |   |   |   |   |   |--- arrival_date >  10.00
|   |   |   |   |   |   |   |   |   |   |--- weights: [1.49, 1.52] class: 1
|   |   |   |   |   |   |   |--- arrival_date >  30.50
|   |   |   |   |   |   |   |   |--- total_nights <= 5.00
|   |   |   |   |   |   |   |   |   |--- weights: [2.24, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- total_nights >  5.00
|   |   |   |   |   |   |   |   |   |--- weights: [1.49, 1.52] class: 1
|   |   |   |   |   |   |--- total_nights >  7.50
|   |   |   |   |   |   |   |--- no_of_special_requests <= 1.50
|   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |--- no_of_special_requests >  1.50
|   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |--- no_of_adults >  2.50
|   |   |   |   |   |   |--- lead_time <= 196.00
|   |   |   |   |   |   |   |--- avg_price_per_room <= 94.95
|   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |   |--- avg_price_per_room >  94.95
|   |   |   |   |   |   |   |   |--- weights: [4.47, 0.00] class: 0
|   |   |   |   |   |   |--- lead_time >  196.00
|   |   |   |   |   |   |   |--- arrival_date <= 21.50
|   |   |   |   |   |   |   |   |--- weights: [0.00, 3.04] class: 1
|   |   |   |   |   |   |   |--- arrival_date >  21.50
|   |   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |--- lead_time >  348.50
|   |   |   |   |   |--- total_nights <= 3.50
|   |   |   |   |   |   |--- arrival_date <= 18.50
|   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |--- arrival_date >  18.50
|   |   |   |   |   |   |   |--- weights: [0.75, 1.52] class: 1
|   |   |   |   |   |--- total_nights >  3.50
|   |   |   |   |   |   |--- avg_price_per_room <= 58.50
|   |   |   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |   |   |--- avg_price_per_room >  58.50
|   |   |   |   |   |   |   |--- weights: [4.47, 3.04] class: 0
|   |--- avg_price_per_room >  100.04
|   |   |--- arrival_month <= 11.50
|   |   |   |--- no_of_special_requests <= 2.50
|   |   |   |   |--- arrival_year <= 2017.50
|   |   |   |   |   |--- weights: [0.00, 133.59] class: 1
|   |   |   |   |--- arrival_year >  2017.50
|   |   |   |   |   |--- weights: [0.00, 3066.59] class: 1
|   |   |   |--- no_of_special_requests >  2.50
|   |   |   |   |--- weights: [23.11, 0.00] class: 0
|   |   |--- arrival_month >  11.50
|   |   |   |--- no_of_special_requests <= 0.50
|   |   |   |   |--- total_nights <= 1.50
|   |   |   |   |   |--- weights: [0.75, 0.00] class: 0
|   |   |   |   |--- total_nights >  1.50
|   |   |   |   |   |--- weights: [34.30, 0.00] class: 0
|   |   |   |--- no_of_special_requests >  0.50
|   |   |   |   |--- arrival_date <= 24.50
|   |   |   |   |   |--- weights: [3.73, 0.00] class: 0
|   |   |   |   |--- arrival_date >  24.50
|   |   |   |   |   |--- lead_time <= 172.50
|   |   |   |   |   |   |--- avg_price_per_room <= 135.49
|   |   |   |   |   |   |   |--- weights: [2.24, 0.00] class: 0
|   |   |   |   |   |   |--- avg_price_per_room >  135.49
|   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |--- lead_time >  172.50
|   |   |   |   |   |   |--- no_of_special_requests <= 1.50
|   |   |   |   |   |   |   |--- weights: [0.00, 13.66] class: 1
|   |   |   |   |   |   |--- no_of_special_requests >  1.50
|   |   |   |   |   |   |   |--- no_of_adults <= 2.50
|   |   |   |   |   |   |   |   |--- avg_price_per_room <= 139.01
|   |   |   |   |   |   |   |   |   |--- weights: [1.49, 0.00] class: 0
|   |   |   |   |   |   |   |   |--- avg_price_per_room >  139.01
|   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |--- no_of_adults >  2.50
|   |   |   |   |   |   |   |   |--- arrival_date <= 26.50
|   |   |   |   |   |   |   |   |   |--- weights: [0.00, 1.52] class: 1
|   |   |   |   |   |   |   |   |--- arrival_date >  26.50
|   |   |   |   |   |   |   |   |   |--- weights: [0.00, 4.55] class: 1

In [184]:
plt.figure(figsize=(20,30))
tree.plot_tree(decisiontree,feature_names=feature_names,filled=True,fontsize=9,node_ids=True,class_names=True)
plt.show()
Out[184]:
<Figure size 2000x3000 with 0 Axes>
Out[184]:
[Text(0.7131192802725084, 0.9861111111111112, 'node #0\nlead_time <= 151.5\ngini = 0.5\nsamples = 25392\nvalue = [12696.0, 12696.0]\nclass = y[0]'),
 Text(0.45287949672850264, 0.9583333333333334, 'node #1\nno_of_special_requests <= 0.5\ngini = 0.472\nsamples = 20410\nvalue = [11676.085, 7209.531]\nclass = y[0]'),
 Text(0.1735591734789969, 0.9305555555555556, 'node #2\nmarket_segment_type_Online <= 0.5\ngini = 0.5\nsamples = 10667\nvalue = [5306.837, 5387.792]\nclass = y[1]'),
 Text(0.10714157535355012, 0.9027777777777778, 'node #3\nlead_time <= 90.5\ngini = 0.381\nsamples = 5395\nvalue = [3439.976, 1185.648]\nclass = y[0]'),
 Text(0.08374171917497142, 0.875, 'node #4\ntotal_nights <= 5.5\ngini = 0.27\nsamples = 4149\nvalue = [2827.132, 541.967]\nclass = y[0]'),
 Text(0.0721436700303304, 0.8472222222222222, 'node #5\navg_price_per_room <= 201.5\ngini = 0.24\nsamples = 3970\nvalue = [2741.394, 444.808]\nclass = y[0]'),
 Text(0.05984772156590466, 0.8194444444444444, 'node #6\nlead_time <= 74.5\ngini = 0.23\nsamples = 3951\nvalue = [2739.903, 419.0]\nclass = y[0]'),
 Text(0.039912402052652736, 0.7916666666666666, 'node #7\narrival_month <= 5.5\ngini = 0.199\nsamples = 3622\nvalue = [2542.331, 321.84]\nclass = y[0]'),
 Text(0.023801662168440724, 0.7638888888888888, 'node #8\narrival_date <= 27.5\ngini = 0.307\nsamples = 1067\nvalue = [713.493, 166.993]\nclass = y[0]'),
 Text(0.018828599320306748, 0.7361111111111112, 'node #9\nlead_time <= 59.5\ngini = 0.244\nsamples = 957\nvalue = [659.813, 109.304]\nclass = y[0]'),
 Text(0.013434643474266145, 0.7083333333333334, 'node #10\nmarket_segment_type_Offline <= 0.5\ngini = 0.199\nsamples = 876\nvalue = [615.08, 77.424]\nclass = y[0]'),
 Text(0.00739205562835083, 0.6805555555555556, 'node #11\nlead_time <= 16.5\ngini = 0.329\nsamples = 351\nvalue = [231.867, 60.725]\nclass = y[0]'),
 Text(0.0050950891901627195, 0.6527777777777778, 'node #12\nrepeated_guest <= 0.5\ngini = 0.224\nsamples = 266\nvalue = [184.897, 27.326]\nclass = y[0]'),
 Text(0.004760984980971722, 0.625, 'node #13\nlead_time <= 6.5\ngini = 0.296\nsamples = 184\nvalue = [123.762, 27.326]\nclass = y[0]'),
 Text(0.003173989987314481, 0.5972222222222222, 'node #14\navg_price_per_room <= 79.5\ngini = 0.373\nsamples = 122\nvalue = [78.283, 25.808]\nclass = y[0]'),
 Text(0.00167052104595499, 0.5694444444444444, 'node #15\ntotal_nights <= 3.5\ngini = 0.183\nsamples = 57\nvalue = [40.26, 4.554]\nclass = y[0]'),
 Text(0.0013364168367639919, 0.5416666666666666, 'node #16\navg_price_per_room <= 65.5\ngini = 0.13\nsamples = 56\nvalue = [40.26, 3.036]\nclass = y[0]'),
 Text(0.0006682084183819959, 0.5138888888888888, 'node #17\navg_price_per_room <= 64.5\ngini = 0.241\nsamples = 27\nvalue = [18.639, 3.036]\nclass = y[0]'),
 Text(0.00033410420919099796, 0.4861111111111111, 'node #18\ngini = -0.0\nsamples = 17\nvalue = [12.674, 0.0]\nclass = y[0]'),
 Text(0.001002312627572994, 0.4861111111111111, 'node #19\narrival_date <= 11.0\ngini = 0.447\nsamples = 10\nvalue = [5.964, 3.036]\nclass = y[0]'),
 Text(0.0006682084183819959, 0.4583333333333333, 'node #20\ngini = 0.0\nsamples = 5\nvalue = [3.728, 0.0]\nclass = y[0]'),
 Text(0.0013364168367639919, 0.4583333333333333, 'node #21\narrival_date <= 17.0\ngini = 0.489\nsamples = 5\nvalue = [2.237, 3.036]\nclass = y[1]'),
 Text(0.0006682084183819959, 0.4305555555555556, 'node #22\narrival_month <= 3.0\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.00033410420919099796, 0.4027777777777778, 'node #23\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.001002312627572994, 0.4027777777777778, 'node #24\nlead_time <= 0.5\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.0006682084183819959, 0.375, 'node #25\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.0013364168367639919, 0.375, 'node #26\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.002004625255145988, 0.4305555555555556, 'node #27\narrival_month <= 3.0\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.00167052104595499, 0.4027777777777778, 'node #28\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.002338729464336986, 0.4027777777777778, 'node #29\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.002004625255145988, 0.5138888888888888, 'node #30\ntotal_nights <= 1.5\ngini = 0.0\nsamples = 29\nvalue = [21.621, 0.0]\nclass = y[0]'),
 Text(0.00167052104595499, 0.4861111111111111, 'node #31\ngini = 0.0\nsamples = 20\nvalue = [14.911, 0.0]\nclass = y[0]'),
 Text(0.002338729464336986, 0.4861111111111111, 'node #32\ngini = 0.0\nsamples = 9\nvalue = [6.71, 0.0]\nclass = y[0]'),
 Text(0.002004625255145988, 0.5416666666666666, 'node #33\ngini = -0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.004677458928673972, 0.5694444444444444, 'node #34\narrival_date <= 8.0\ngini = 0.46\nsamples = 65\nvalue = [38.023, 21.254]\nclass = y[0]'),
 Text(0.004343354719482974, 0.5416666666666666, 'node #35\ngini = 0.0\nsamples = 16\nvalue = [11.929, 0.0]\nclass = y[0]'),
 Text(0.00501156313786497, 0.5416666666666666, 'node #36\ntotal_nights <= 2.5\ngini = 0.495\nsamples = 49\nvalue = [26.094, 21.254]\nclass = y[0]'),
 Text(0.004677458928673972, 0.5138888888888888, 'node #37\narrival_date <= 24.5\ngini = 0.499\nsamples = 40\nvalue = [19.384, 21.254]\nclass = y[1]'),
 Text(0.004343354719482974, 0.4861111111111111, 'node #38\nno_of_adults <= 1.5\ngini = 0.5\nsamples = 38\nvalue = [19.384, 18.217]\nclass = y[0]'),
 Text(0.0036751463011009777, 0.4583333333333333, 'node #39\navg_price_per_room <= 86.5\ngini = 0.482\nsamples = 28\nvalue = [15.657, 10.627]\nclass = y[0]'),
 Text(0.00334104209190998, 0.4305555555555556, 'node #40\ngini = 0.0\nsamples = 9\nvalue = [6.71, 0.0]\nclass = y[0]'),
 Text(0.004009250510291976, 0.4305555555555556, 'node #41\narrival_date <= 22.5\ngini = 0.496\nsamples = 19\nvalue = [8.947, 10.627]\nclass = y[1]'),
 Text(0.0036751463011009777, 0.4027777777777778, 'node #42\nlead_time <= 2.5\ngini = 0.474\nsamples = 16\nvalue = [6.71, 10.627]\nclass = y[1]'),
 Text(0.002004625255145988, 0.375, 'node #43\nlead_time <= 1.5\ngini = 0.405\nsamples = 9\nvalue = [2.982, 7.591]\nclass = y[1]'),
 Text(0.0013364168367639919, 0.3472222222222222, 'node #44\nmarket_segment_type_Corporate <= 0.5\ngini = 0.489\nsamples = 5\nvalue = [2.237, 3.036]\nclass = y[1]'),
 Text(0.001002312627572994, 0.3194444444444444, 'node #45\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.00167052104595499, 0.3194444444444444, 'node #46\navg_price_per_room <= 151.59\ngini = 0.482\nsamples = 4\nvalue = [2.237, 1.518]\nclass = y[0]'),
 Text(0.0013364168367639919, 0.2916666666666667, 'node #47\navg_price_per_room <= 92.5\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.001002312627572994, 0.2638888888888889, 'node #48\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.00167052104595499, 0.2638888888888889, 'node #49\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.002004625255145988, 0.2916666666666667, 'node #50\ngini = -0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.0026728336735279837, 0.3472222222222222, 'node #51\navg_price_per_room <= 99.0\ngini = 0.242\nsamples = 4\nvalue = [0.746, 4.554]\nclass = y[1]'),
 Text(0.002338729464336986, 0.3194444444444444, 'node #52\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.003006937882718982, 0.3194444444444444, 'node #53\narrival_date <= 20.5\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.0026728336735279837, 0.2916666666666667, 'node #54\ntotal_nights <= 1.5\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.002338729464336986, 0.2638888888888889, 'node #55\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.003006937882718982, 0.2638888888888889, 'node #56\ngini = -0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.00334104209190998, 0.2916666666666667, 'node #57\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.005345667347055967, 0.375, 'node #58\navg_price_per_room <= 99.5\ngini = 0.495\nsamples = 7\nvalue = [3.728, 3.036]\nclass = y[0]'),
 Text(0.004677458928673972, 0.3472222222222222, 'node #59\ntotal_nights <= 1.5\ngini = 0.442\nsamples = 4\nvalue = [1.491, 3.036]\nclass = y[1]'),
 Text(0.004343354719482974, 0.3194444444444444, 'node #60\narrival_month <= 3.5\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.004009250510291976, 0.2916666666666667, 'node #61\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.004677458928673972, 0.2916666666666667, 'node #62\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.00501156313786497, 0.3194444444444444, 'node #63\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.006013875765437964, 0.3472222222222222, 'node #64\ntotal_nights <= 1.5\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.005679771556246965, 0.3194444444444444, 'node #65\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.006347979974628962, 0.3194444444444444, 'node #66\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.004343354719482974, 0.4027777777777778, 'node #67\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.00501156313786497, 0.4583333333333333, 'node #68\navg_price_per_room <= 87.5\ngini = 0.442\nsamples = 10\nvalue = [3.728, 7.591]\nclass = y[1]'),
 Text(0.004677458928673972, 0.4305555555555556, 'node #69\ngini = 0.0\nsamples = 4\nvalue = [0.0, 6.072]\nclass = y[1]'),
 Text(0.005345667347055967, 0.4305555555555556, 'node #70\narrival_date <= 22.0\ngini = 0.411\nsamples = 6\nvalue = [3.728, 1.518]\nclass = y[0]'),
 Text(0.00501156313786497, 0.4027777777777778, 'node #71\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.005679771556246965, 0.4027777777777778, 'node #72\ngini = 0.5\nsamples = 3\nvalue = [1.491, 1.518]\nclass = y[1]'),
 Text(0.00501156313786497, 0.4861111111111111, 'node #73\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.005345667347055967, 0.5138888888888888, 'node #74\ngini = 0.0\nsamples = 9\nvalue = [6.71, 0.0]\nclass = y[0]'),
 Text(0.006347979974628962, 0.5972222222222222, 'node #75\narrival_date <= 2.5\ngini = 0.063\nsamples = 62\nvalue = [45.479, 1.518]\nclass = y[0]'),
 Text(0.006013875765437964, 0.5694444444444444, 'node #76\navg_price_per_room <= 55.5\ngini = 0.5\nsamples = 3\nvalue = [1.491, 1.518]\nclass = y[1]'),
 Text(0.005679771556246965, 0.5416666666666666, 'node #77\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.006347979974628962, 0.5416666666666666, 'node #78\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.00668208418381996, 0.5694444444444444, 'node #79\ngini = 0.0\nsamples = 59\nvalue = [43.988, 0.0]\nclass = y[0]'),
 Text(0.005429193399353717, 0.625, 'node #80\ngini = 0.0\nsamples = 82\nvalue = [61.135, 0.0]\nclass = y[0]'),
 Text(0.009689022066538941, 0.6527777777777778, 'node #81\narrival_date <= 16.5\ngini = 0.486\nsamples = 85\nvalue = [46.97, 33.399]\nclass = y[0]'),
 Text(0.00835260522977495, 0.625, 'node #82\nlead_time <= 44.5\ngini = 0.301\nsamples = 40\nvalue = [26.84, 6.072]\nclass = y[0]'),
 Text(0.007684396811392953, 0.5972222222222222, 'node #83\narrival_date <= 4.5\ngini = 0.187\nsamples = 37\nvalue = [26.094, 3.036]\nclass = y[0]'),
 Text(0.007350292602201955, 0.5694444444444444, 'node #84\nno_of_adults <= 1.5\ngini = 0.495\nsamples = 7\nvalue = [3.728, 3.036]\nclass = y[0]'),
 Text(0.007016188393010958, 0.5416666666666666, 'node #85\nmarket_segment_type_Corporate <= 0.5\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.00668208418381996, 0.5138888888888888, 'node #86\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.007350292602201955, 0.5138888888888888, 'node #87\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.007684396811392953, 0.5416666666666666, 'node #88\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.008018501020583952, 0.5694444444444444, 'node #89\ngini = 0.0\nsamples = 30\nvalue = [22.367, 0.0]\nclass = y[0]'),
 Text(0.009020813648156946, 0.5972222222222222, 'node #90\narrival_month <= 4.5\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.008686709438965948, 0.5694444444444444, 'node #91\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.009354917857347943, 0.5694444444444444, 'node #92\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.011025438903302933, 0.625, 'node #93\narrival_month <= 1.5\ngini = 0.489\nsamples = 45\nvalue = [20.13, 27.326]\nclass = y[1]'),
 Text(0.010357230484920937, 0.5972222222222222, 'node #94\navg_price_per_room <= 63.0\ngini = 0.429\nsamples = 22\nvalue = [13.42, 6.072]\nclass = y[0]'),
 Text(0.01002312627572994, 0.5694444444444444, 'node #95\navg_price_per_room <= 60.5\ngini = 0.393\nsamples = 7\nvalue = [2.237, 6.072]\nclass = y[1]'),
 Text(0.009689022066538941, 0.5416666666666666, 'node #96\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.010357230484920937, 0.5416666666666666, 'node #97\ngini = 0.0\nsamples = 4\nvalue = [0.0, 6.072]\nclass = y[1]'),
 Text(0.010691334694111935, 0.5694444444444444, 'node #98\ngini = 0.0\nsamples = 15\nvalue = [11.183, 0.0]\nclass = y[0]'),
 Text(0.01169364732168493, 0.5972222222222222, 'node #99\narrival_date <= 18.5\ngini = 0.365\nsamples = 23\nvalue = [6.71, 21.254]\nclass = y[1]'),
 Text(0.01135954311249393, 0.5694444444444444, 'node #100\ngini = 0.0\nsamples = 5\nvalue = [0.0, 7.591]\nclass = y[1]'),
 Text(0.012027751530875928, 0.5694444444444444, 'node #101\narrival_date <= 23.5\ngini = 0.442\nsamples = 18\nvalue = [6.71, 13.663]\nclass = y[1]'),
 Text(0.011192491007898433, 0.5416666666666666, 'node #102\narrival_date <= 19.5\ngini = 0.5\nsamples = 12\nvalue = [5.964, 6.072]\nclass = y[1]'),
 Text(0.010524282589516437, 0.5138888888888888, 'node #103\navg_price_per_room <= 97.5\ngini = 0.442\nsamples = 8\nvalue = [2.982, 6.072]\nclass = y[1]'),
 Text(0.010190178380325439, 0.4861111111111111, 'node #104\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.010858386798707435, 0.4861111111111111, 'node #105\navg_price_per_room <= 104.5\ngini = 0.5\nsamples = 6\nvalue = [2.982, 3.036]\nclass = y[1]'),
 Text(0.010524282589516437, 0.4583333333333333, 'node #106\ngini = 0.489\nsamples = 5\nvalue = [2.237, 3.036]\nclass = y[1]'),
 Text(0.011192491007898433, 0.4583333333333333, 'node #107\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.011860699426280428, 0.5138888888888888, 'node #108\narrival_month <= 3.0\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.01152659521708943, 0.4861111111111111, 'node #109\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.012194803635471426, 0.4861111111111111, 'node #110\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.012863012053853422, 0.5416666666666666, 'node #111\nno_of_previous_cancellations <= 0.5\ngini = 0.163\nsamples = 6\nvalue = [0.746, 7.591]\nclass = y[1]'),
 Text(0.012528907844662424, 0.5138888888888888, 'node #112\ngini = -0.0\nsamples = 5\nvalue = [0.0, 7.591]\nclass = y[1]'),
 Text(0.01319711626304442, 0.5138888888888888, 'node #113\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.01947723132018146, 0.6805555555555556, 'node #114\ntype_of_meal_plan_Not Selected <= 0.5\ngini = 0.08\nsamples = 525\nvalue = [383.214, 16.699]\nclass = y[0]'),
 Text(0.01757179325213905, 0.6527777777777778, 'node #115\nno_of_adults <= 2.5\ngini = 0.062\nsamples = 501\nvalue = [367.557, 12.145]\nclass = y[0]'),
 Text(0.015765542371200217, 0.625, 'node #116\nlead_time <= 23.5\ngini = 0.055\nsamples = 497\nvalue = [365.32, 10.627]\nclass = y[0]'),
 Text(0.014199428890617415, 0.5972222222222222, 'node #117\nlead_time <= 0.5\ngini = 0.015\nsamples = 271\nvalue = [201.299, 1.518]\nclass = y[0]'),
 Text(0.013865324681426415, 0.5694444444444444, 'node #118\narrival_month <= 3.5\ngini = 0.183\nsamples = 19\nvalue = [13.42, 1.518]\nclass = y[0]'),
 Text(0.013531220472235417, 0.5416666666666666, 'node #119\ngini = 0.0\nsamples = 16\nvalue = [11.929, 0.0]\nclass = y[0]'),
 Text(0.014199428890617415, 0.5416666666666666, 'node #120\narrival_date <= 13.0\ngini = 0.5\nsamples = 3\nvalue = [1.491, 1.518]\nclass = y[1]'),
 Text(0.013865324681426415, 0.5138888888888888, 'node #121\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.014533533099808413, 0.5138888888888888, 'node #122\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.014533533099808413, 0.5694444444444444, 'node #123\ngini = 0.0\nsamples = 252\nvalue = [187.879, 0.0]\nclass = y[0]'),
 Text(0.01733165585178302, 0.5972222222222222, 'node #124\navg_price_per_room <= 74.9\ngini = 0.1\nsamples = 226\nvalue = [164.021, 9.109]\nclass = y[0]'),
 Text(0.015953475988870154, 0.5694444444444444, 'node #125\navg_price_per_room <= 62.6\ngini = 0.237\nsamples = 69\nvalue = [47.715, 7.591]\nclass = y[0]'),
 Text(0.015619371779679156, 0.5416666666666666, 'node #126\ngini = 0.0\nsamples = 32\nvalue = [23.858, 0.0]\nclass = y[0]'),
 Text(0.01628758019806115, 0.5416666666666666, 'node #127\narrival_date <= 20.5\ngini = 0.366\nsamples = 37\nvalue = [23.858, 7.591]\nclass = y[0]'),
 Text(0.015201741518190409, 0.5138888888888888, 'node #128\nlead_time <= 43.0\ngini = 0.234\nsamples = 28\nvalue = [19.384, 3.036]\nclass = y[0]'),
 Text(0.014366480995212913, 0.4861111111111111, 'node #129\nroom_type_reserved_Room_Type 4 <= 0.5\ngini = 0.0\nsamples = 20\nvalue = [14.911, 0.0]\nclass = y[0]'),
 Text(0.014032376786021915, 0.4583333333333333, 'node #130\ngini = 0.0\nsamples = 18\nvalue = [13.42, 0.0]\nclass = y[0]'),
 Text(0.01470058520440391, 0.4583333333333333, 'node #131\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.016037002041167904, 0.4861111111111111, 'node #132\narrival_date <= 16.5\ngini = 0.482\nsamples = 8\nvalue = [4.473, 3.036]\nclass = y[0]'),
 Text(0.015368793622785907, 0.4583333333333333, 'node #133\navg_price_per_room <= 64.2\ngini = 0.442\nsamples = 4\nvalue = [1.491, 3.036]\nclass = y[1]'),
 Text(0.015034689413594909, 0.4305555555555556, 'node #134\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.015702897831976904, 0.4305555555555556, 'node #135\narrival_month <= 2.5\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.015368793622785907, 0.4027777777777778, 'node #136\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.016037002041167904, 0.4027777777777778, 'node #137\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.0167052104595499, 0.4583333333333333, 'node #138\ntotal_nights <= 3.5\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.0163711062503589, 0.4305555555555556, 'node #139\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.017039314668740896, 0.4305555555555556, 'node #140\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.017373418877931895, 0.5138888888888888, 'node #141\nlead_time <= 37.5\ngini = 0.5\nsamples = 9\nvalue = [4.473, 4.554]\nclass = y[1]'),
 Text(0.017039314668740896, 0.4861111111111111, 'node #142\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.01770752308712289, 0.4861111111111111, 'node #143\navg_price_per_room <= 64.0\ngini = 0.378\nsamples = 7\nvalue = [4.473, 1.518]\nclass = y[0]'),
 Text(0.017373418877931895, 0.4583333333333333, 'node #144\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.01804162729631389, 0.4583333333333333, 'node #145\ngini = 0.0\nsamples = 6\nvalue = [4.473, 0.0]\nclass = y[0]'),
 Text(0.018709835714695887, 0.5694444444444444, 'node #146\nlead_time <= 33.5\ngini = 0.025\nsamples = 157\nvalue = [116.306, 1.518]\nclass = y[0]'),
 Text(0.018375731505504887, 0.5416666666666666, 'node #147\nlead_time <= 32.5\ngini = 0.139\nsamples = 26\nvalue = [18.639, 1.518]\nclass = y[0]'),
 Text(0.01804162729631389, 0.5138888888888888, 'node #148\ngini = 0.0\nsamples = 24\nvalue = [17.893, 0.0]\nclass = y[0]'),
 Text(0.018709835714695887, 0.5138888888888888, 'node #149\narrival_month <= 3.5\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.018375731505504887, 0.4861111111111111, 'node #150\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.019043939923886886, 0.4861111111111111, 'node #151\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.019043939923886886, 0.5416666666666666, 'node #152\ngini = 0.0\nsamples = 131\nvalue = [97.667, 0.0]\nclass = y[0]'),
 Text(0.019378044133077883, 0.625, 'node #153\narrival_date <= 7.5\ngini = 0.482\nsamples = 4\nvalue = [2.237, 1.518]\nclass = y[0]'),
 Text(0.019043939923886886, 0.5972222222222222, 'node #154\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.019712148342268882, 0.5972222222222222, 'node #155\narrival_date <= 14.0\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.019378044133077883, 0.5694444444444444, 'node #156\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.02004625255145988, 0.5694444444444444, 'node #157\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.02138266938822387, 0.6527777777777778, 'node #158\navg_price_per_room <= 61.6\ngini = 0.349\nsamples = 24\nvalue = [15.657, 4.554]\nclass = y[0]'),
 Text(0.020714460969841874, 0.625, 'node #159\navg_price_per_room <= 56.4\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.020380356760650878, 0.5972222222222222, 'node #160\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.021048565179032874, 0.5972222222222222, 'node #161\ntotal_nights <= 2.0\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.020714460969841874, 0.5694444444444444, 'node #162\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.02138266938822387, 0.5694444444444444, 'node #163\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.022050877806605865, 0.625, 'node #164\navg_price_per_room <= 92.44\ngini = 0.168\nsamples = 21\nvalue = [14.911, 1.518]\nclass = y[0]'),
 Text(0.02171677359741487, 0.5972222222222222, 'node #165\ngini = -0.0\nsamples = 18\nvalue = [13.42, 0.0]\nclass = y[0]'),
 Text(0.022384982015796865, 0.5972222222222222, 'node #166\navg_price_per_room <= 118.44\ngini = 0.5\nsamples = 3\nvalue = [1.491, 1.518]\nclass = y[1]'),
 Text(0.022050877806605865, 0.5694444444444444, 'node #167\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.02271908622498786, 0.5694444444444444, 'node #168\ngini = -0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.024222555166347352, 0.7083333333333334, 'node #169\narrival_date <= 16.5\ngini = 0.486\nsamples = 81\nvalue = [44.733, 31.88]\nclass = y[0]'),
 Text(0.023888450957156356, 0.6805555555555556, 'node #170\ngini = 0.0\nsamples = 26\nvalue = [19.384, 0.0]\nclass = y[0]'),
 Text(0.024556659375538352, 0.6805555555555556, 'node #171\nno_of_adults <= 1.5\ngini = 0.493\nsamples = 55\nvalue = [25.349, 31.88]\nclass = y[1]'),
 Text(0.023721398852560856, 0.6527777777777778, 'node #172\nlead_time <= 65.5\ngini = 0.378\nsamples = 28\nvalue = [17.893, 6.072]\nclass = y[0]'),
 Text(0.02338729464336986, 0.625, 'node #173\nlead_time <= 63.5\ngini = 0.144\nsamples = 25\nvalue = [17.893, 1.518]\nclass = y[0]'),
 Text(0.02305319043417886, 0.5972222222222222, 'node #174\ngini = 0.0\nsamples = 6\nvalue = [4.473, 0.0]\nclass = y[0]'),
 Text(0.023721398852560856, 0.5972222222222222, 'node #175\ntotal_nights <= 2.5\ngini = 0.183\nsamples = 19\nvalue = [13.42, 1.518]\nclass = y[0]'),
 Text(0.02338729464336986, 0.5694444444444444, 'node #176\ngini = 0.21\nsamples = 16\nvalue = [11.183, 1.518]\nclass = y[0]'),
 Text(0.024055503061751856, 0.5694444444444444, 'node #177\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.024055503061751856, 0.625, 'node #178\ngini = -0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.025391919898515847, 0.6527777777777778, 'node #179\nlead_time <= 64.5\ngini = 0.348\nsamples = 27\nvalue = [7.456, 25.808]\nclass = y[1]'),
 Text(0.02472371148013385, 0.625, 'node #180\navg_price_per_room <= 69.29\ngini = 0.103\nsamples = 19\nvalue = [1.491, 25.808]\nclass = y[1]'),
 Text(0.024389607270942852, 0.5972222222222222, 'node #181\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.025057815689324848, 0.5972222222222222, 'node #182\narrival_date <= 25.5\ngini = 0.055\nsamples = 18\nvalue = [0.746, 25.808]\nclass = y[1]'),
 Text(0.02472371148013385, 0.5694444444444444, 'node #183\ngini = 0.0\nsamples = 17\nvalue = [0.0, 25.808]\nclass = y[1]'),
 Text(0.025391919898515847, 0.5694444444444444, 'node #184\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.026060128316897843, 0.625, 'node #185\navg_price_per_room <= 49.085\ngini = 0.0\nsamples = 8\nvalue = [5.964, 0.0]\nclass = y[0]'),
 Text(0.025726024107706844, 0.5972222222222222, 'node #186\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.02639423252608884, 0.5972222222222222, 'node #187\ngini = 0.0\nsamples = 7\nvalue = [5.219, 0.0]\nclass = y[0]'),
 Text(0.0287747250165747, 0.7361111111111112, 'node #188\navg_price_per_room <= 61.0\ngini = 0.499\nsamples = 110\nvalue = [53.68, 57.688]\nclass = y[1]'),
 Text(0.02773064936285283, 0.7083333333333334, 'node #189\navg_price_per_room <= 59.75\ngini = 0.252\nsamples = 46\nvalue = [8.947, 51.616]\nclass = y[1]'),
 Text(0.027396545153661835, 0.6805555555555556, 'node #190\narrival_date <= 28.5\ngini = 0.248\nsamples = 13\nvalue = [8.947, 1.518]\nclass = y[0]'),
 Text(0.027062440944470835, 0.6527777777777778, 'node #191\navg_price_per_room <= 41.085\ngini = 0.482\nsamples = 4\nvalue = [2.237, 1.518]\nclass = y[0]'),
 Text(0.02672833673527984, 0.625, 'node #192\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.027396545153661835, 0.625, 'node #193\nlead_time <= 27.5\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.027062440944470835, 0.5972222222222222, 'node #194\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.02773064936285283, 0.5972222222222222, 'node #195\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.02773064936285283, 0.6527777777777778, 'node #196\ngini = 0.0\nsamples = 9\nvalue = [6.71, 0.0]\nclass = y[0]'),
 Text(0.02806475357204383, 0.6805555555555556, 'node #197\ngini = -0.0\nsamples = 33\nvalue = [0.0, 50.098]\nclass = y[1]'),
 Text(0.02981880067029657, 0.7083333333333334, 'node #198\narrival_date <= 29.5\ngini = 0.21\nsamples = 64\nvalue = [44.733, 6.072]\nclass = y[0]'),
 Text(0.028732961990425826, 0.6805555555555556, 'node #199\ntype_of_meal_plan_Meal Plan 2 <= 0.5\ngini = 0.0\nsamples = 36\nvalue = [26.84, 0.0]\nclass = y[0]'),
 Text(0.02839885778123483, 0.6527777777777778, 'node #200\ngini = 0.0\nsamples = 35\nvalue = [26.094, 0.0]\nclass = y[0]'),
 Text(0.029067066199616826, 0.6527777777777778, 'node #201\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.030904639350167313, 0.6805555555555556, 'node #202\navg_price_per_room <= 87.25\ngini = 0.378\nsamples = 28\nvalue = [17.893, 6.072]\nclass = y[0]'),
 Text(0.02973527461799882, 0.6527777777777778, 'node #203\navg_price_per_room <= 73.0\ngini = 0.183\nsamples = 19\nvalue = [13.42, 1.518]\nclass = y[0]'),
 Text(0.02940117040880782, 0.625, 'node #204\ngini = 0.0\nsamples = 11\nvalue = [8.201, 0.0]\nclass = y[0]'),
 Text(0.030069378827189817, 0.625, 'node #205\navg_price_per_room <= 79.5\ngini = 0.349\nsamples = 8\nvalue = [5.219, 1.518]\nclass = y[0]'),
 Text(0.02973527461799882, 0.5972222222222222, 'node #206\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.030403483036380817, 0.5972222222222222, 'node #207\ntotal_nights <= 3.0\ngini = 0.0\nsamples = 7\nvalue = [5.219, 0.0]\nclass = y[0]'),
 Text(0.030069378827189817, 0.5694444444444444, 'node #208\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.030737587245571813, 0.5694444444444444, 'node #209\ngini = 0.0\nsamples = 5\nvalue = [3.728, 0.0]\nclass = y[0]'),
 Text(0.03207400408233581, 0.6527777777777778, 'node #210\nno_of_adults <= 1.5\ngini = 0.5\nsamples = 9\nvalue = [4.473, 4.554]\nclass = y[1]'),
 Text(0.03140579566395381, 0.625, 'node #211\nlead_time <= 7.5\ngini = 0.442\nsamples = 6\nvalue = [2.237, 4.554]\nclass = y[1]'),
 Text(0.031071691454762813, 0.5972222222222222, 'node #212\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.03173989987314481, 0.5972222222222222, 'node #213\narrival_month <= 4.0\ngini = 0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.03140579566395381, 0.5694444444444444, 'node #214\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.03207400408233581, 0.5694444444444444, 'node #215\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.0327422125007178, 0.625, 'node #216\narrival_month <= 4.5\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.0324081082915268, 0.5972222222222222, 'node #217\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.0330763167099088, 0.5972222222222222, 'node #218\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.056023141936864744, 0.7638888888888888, 'node #219\nmarket_segment_type_Offline <= 0.5\ngini = 0.144\nsamples = 2555\nvalue = [1828.838, 154.848]\nclass = y[0]'),
 Text(0.04939130389388015, 0.7361111111111112, 'node #220\nrepeated_guest <= 0.5\ngini = 0.264\nsamples = 744\nvalue = [508.466, 94.123]\nclass = y[0]'),
 Text(0.049057199684689155, 0.7083333333333334, 'node #221\nroom_type_reserved_Room_Type 4 <= 0.5\ngini = 0.32\nsamples = 566\nvalue = [375.758, 94.123]\nclass = y[0]'),
 Text(0.041572482342070505, 0.6805555555555556, 'node #222\navg_price_per_room <= 61.0\ngini = 0.281\nsamples = 507\nvalue = [343.699, 69.833]\nclass = y[0]'),
 Text(0.041238378132879505, 0.6527777777777778, 'node #223\ngini = 0.0\nsamples = 75\nvalue = [55.916, 0.0]\nclass = y[0]'),
 Text(0.041906586551261504, 0.6527777777777778, 'node #224\narrival_month <= 11.5\ngini = 0.314\nsamples = 432\nvalue = [287.783, 69.833]\nclass = y[0]'),
 Text(0.041572482342070505, 0.625, 'node #225\ntotal_nights <= 3.5\ngini = 0.344\nsamples = 377\nvalue = [246.778, 69.833]\nclass = y[0]'),
 Text(0.0337445251282908, 0.5972222222222222, 'node #226\navg_price_per_room <= 121.5\ngini = 0.321\nsamples = 346\nvalue = [229.63, 57.688]\nclass = y[0]'),
 Text(0.0334104209190998, 0.5694444444444444, 'node #227\narrival_date <= 6.5\ngini = 0.348\nsamples = 306\nvalue = [199.808, 57.688]\nclass = y[0]'),
 Text(0.026096670964778107, 0.5416666666666666, 'node #228\navg_price_per_room <= 109.5\ngini = 0.158\nsamples = 45\nvalue = [32.059, 3.036]\nclass = y[0]'),
 Text(0.02542846254639611, 0.5138888888888888, 'node #229\narrival_month <= 10.5\ngini = 0.088\nsamples = 43\nvalue = [31.313, 1.518]\nclass = y[0]'),
 Text(0.025094358337205115, 0.4861111111111111, 'node #230\ngini = -0.0\nsamples = 37\nvalue = [27.585, 0.0]\nclass = y[0]'),
 Text(0.02576256675558711, 0.4861111111111111, 'node #231\narrival_date <= 4.0\ngini = 0.411\nsamples = 6\nvalue = [3.728, 1.518]\nclass = y[0]'),
 Text(0.02542846254639611, 0.4583333333333333, 'node #232\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.026096670964778107, 0.4583333333333333, 'node #233\nlead_time <= 11.5\ngini = 0.5\nsamples = 3\nvalue = [1.491, 1.518]\nclass = y[1]'),
 Text(0.02576256675558711, 0.4305555555555556, 'node #234\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.026430775173969107, 0.4305555555555556, 'node #235\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.026764879383160103, 0.5138888888888888, 'node #236\nlead_time <= 8.0\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.026430775173969107, 0.4861111111111111, 'node #237\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.027098983592351102, 0.4861111111111111, 'node #238\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.040724170873421485, 0.5416666666666666, 'node #239\nno_of_adults <= 1.5\ngini = 0.371\nsamples = 261\nvalue = [167.749, 54.652]\nclass = y[0]'),
 Text(0.035973626648986984, 0.5138888888888888, 'node #240\narrival_date <= 19.5\ngini = 0.331\nsamples = 217\nvalue = [143.146, 37.953]\nclass = y[0]'),
 Text(0.0310664710764942, 0.4861111111111111, 'node #241\narrival_date <= 14.5\ngini = 0.39\nsamples = 126\nvalue = [79.774, 28.844]\nclass = y[0]'),
 Text(0.028351874376817344, 0.4583333333333333, 'node #242\narrival_month <= 8.5\ngini = 0.287\nsamples = 75\nvalue = [50.698, 10.627]\nclass = y[0]'),
 Text(0.027098983592351102, 0.4305555555555556, 'node #243\nlead_time <= 9.5\ngini = 0.453\nsamples = 29\nvalue = [17.148, 9.109]\nclass = y[0]'),
 Text(0.026263723069373607, 0.4027777777777778, 'node #244\narrival_date <= 9.5\ngini = 0.499\nsamples = 13\nvalue = [6.71, 6.072]\nclass = y[0]'),
 Text(0.02559551465099161, 0.375, 'node #245\narrival_date <= 7.5\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.02526141044180061, 0.3472222222222222, 'node #246\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.025929618860182607, 0.3472222222222222, 'node #247\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.026931931487755603, 0.375, 'node #248\narrival_month <= 6.5\ngini = 0.447\nsamples = 10\nvalue = [5.964, 3.036]\nclass = y[0]'),
 Text(0.026597827278564606, 0.3472222222222222, 'node #249\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.027266035696946602, 0.3472222222222222, 'node #250\ntotal_nights <= 1.5\ngini = 0.323\nsamples = 9\nvalue = [5.964, 1.518]\nclass = y[0]'),
 Text(0.026931931487755603, 0.3194444444444444, 'node #251\ngini = 0.0\nsamples = 7\nvalue = [5.219, 0.0]\nclass = y[0]'),
 Text(0.027600139906137598, 0.3194444444444444, 'node #252\navg_price_per_room <= 94.075\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.027266035696946602, 0.2916666666666667, 'node #253\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.027934244115328598, 0.2916666666666667, 'node #254\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.027934244115328598, 0.4027777777777778, 'node #255\narrival_date <= 7.5\ngini = 0.349\nsamples = 16\nvalue = [10.438, 3.036]\nclass = y[0]'),
 Text(0.027600139906137598, 0.375, 'node #256\ngini = 0.465\nsamples = 9\nvalue = [5.219, 3.036]\nclass = y[0]'),
 Text(0.028268348324519594, 0.375, 'node #257\ngini = 0.0\nsamples = 7\nvalue = [5.219, 0.0]\nclass = y[0]'),
 Text(0.029604765161283585, 0.4305555555555556, 'node #258\nlead_time <= 3.5\ngini = 0.083\nsamples = 46\nvalue = [33.55, 1.518]\nclass = y[0]'),
 Text(0.02927066095209259, 0.4027777777777778, 'node #259\nlead_time <= 2.5\ngini = 0.191\nsamples = 18\nvalue = [12.674, 1.518]\nclass = y[0]'),
 Text(0.02893655674290159, 0.375, 'node #260\ngini = 0.0\nsamples = 17\nvalue = [12.674, 0.0]\nclass = y[0]'),
 Text(0.029604765161283585, 0.375, 'node #261\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.029938869370474585, 0.4027777777777778, 'node #262\ngini = -0.0\nsamples = 28\nvalue = [20.875, 0.0]\nclass = y[0]'),
 Text(0.03378106777617106, 0.4583333333333333, 'node #263\nlead_time <= 22.5\ngini = 0.474\nsamples = 51\nvalue = [29.077, 18.217]\nclass = y[0]'),
 Text(0.03160939041642957, 0.4305555555555556, 'node #264\narrival_month <= 10.5\ngini = 0.426\nsamples = 39\nvalue = [23.858, 10.627]\nclass = y[0]'),
 Text(0.03060707778885658, 0.4027777777777778, 'node #265\narrival_month <= 8.5\ngini = 0.281\nsamples = 22\nvalue = [14.911, 3.036]\nclass = y[0]'),
 Text(0.03027297357966558, 0.375, 'node #266\narrival_year <= 2017.5\ngini = 0.482\nsamples = 8\nvalue = [4.473, 3.036]\nclass = y[0]'),
 Text(0.029604765161283585, 0.3472222222222222, 'node #267\navg_price_per_room <= 66.0\ngini = 0.489\nsamples = 5\nvalue = [2.237, 3.036]\nclass = y[1]'),
 Text(0.02927066095209259, 0.3194444444444444, 'node #268\ngini = -0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.029938869370474585, 0.3194444444444444, 'node #269\ntotal_nights <= 1.5\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.029604765161283585, 0.2916666666666667, 'node #270\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.03027297357966558, 0.2916666666666667, 'node #271\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.03094118199804758, 0.3472222222222222, 'node #272\nlead_time <= 3.0\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.03060707778885658, 0.3194444444444444, 'node #273\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.03127528620723858, 0.3194444444444444, 'node #274\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.03094118199804758, 0.375, 'node #275\ngini = -0.0\nsamples = 14\nvalue = [10.438, 0.0]\nclass = y[0]'),
 Text(0.03261170304400257, 0.4027777777777778, 'node #276\nlead_time <= 19.5\ngini = 0.497\nsamples = 17\nvalue = [8.947, 7.591]\nclass = y[0]'),
 Text(0.03194349462562057, 0.375, 'node #277\narrival_date <= 16.5\ngini = 0.442\nsamples = 8\nvalue = [2.982, 6.072]\nclass = y[1]'),
 Text(0.03160939041642957, 0.3472222222222222, 'node #278\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.03227759883481157, 0.3472222222222222, 'node #279\navg_price_per_room <= 66.0\ngini = 0.317\nsamples = 6\nvalue = [1.491, 6.072]\nclass = y[1]'),
 Text(0.03194349462562057, 0.3194444444444444, 'node #280\ngini = 0.442\nsamples = 4\nvalue = [1.491, 3.036]\nclass = y[1]'),
 Text(0.03261170304400257, 0.3194444444444444, 'node #281\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.033279911462384563, 0.375, 'node #282\navg_price_per_room <= 66.5\ngini = 0.323\nsamples = 9\nvalue = [5.964, 1.518]\nclass = y[0]'),
 Text(0.032945807253193564, 0.3472222222222222, 'node #283\ngini = 0.349\nsamples = 8\nvalue = [5.219, 1.518]\nclass = y[0]'),
 Text(0.03361401567157556, 0.3472222222222222, 'node #284\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.035952745135912546, 0.4305555555555556, 'node #285\nlead_time <= 38.0\ngini = 0.483\nsamples = 12\nvalue = [5.219, 7.591]\nclass = y[1]'),
 Text(0.035284536717530554, 0.4027777777777778, 'node #286\ntotal_nights <= 2.5\ngini = 0.405\nsamples = 9\nvalue = [2.982, 7.591]\nclass = y[1]'),
 Text(0.034950432508339555, 0.375, 'node #287\narrival_date <= 15.5\ngini = 0.5\nsamples = 6\nvalue = [2.982, 3.036]\nclass = y[1]'),
 Text(0.034282224089957555, 0.3472222222222222, 'node #288\narrival_month <= 9.5\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.03394811988076656, 0.3194444444444444, 'node #289\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.034616328299148555, 0.3194444444444444, 'node #290\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.035618640926721554, 0.3472222222222222, 'node #291\narrival_month <= 8.0\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.035284536717530554, 0.3194444444444444, 'node #292\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.035952745135912546, 0.3194444444444444, 'node #293\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.035618640926721554, 0.375, 'node #294\ngini = 0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.036620953554294546, 0.4027777777777778, 'node #295\nlead_time <= 48.0\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.036286849345103546, 0.375, 'node #296\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.036955057763485545, 0.375, 'node #297\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.04088078222147977, 0.4861111111111111, 'node #298\nlead_time <= 43.5\ngini = 0.22\nsamples = 91\nvalue = [63.372, 9.109]\nclass = y[0]'),
 Text(0.04004552169850227, 0.4583333333333333, 'node #299\navg_price_per_room <= 107.5\ngini = 0.193\nsamples = 89\nvalue = [62.626, 7.591]\nclass = y[0]'),
 Text(0.03937731328012028, 0.4305555555555556, 'node #300\navg_price_per_room <= 81.5\ngini = 0.163\nsamples = 87\nvalue = [61.881, 6.072]\nclass = y[0]'),
 Text(0.03904320907092928, 0.4027777777777778, 'node #301\ntotal_nights <= 1.5\ngini = 0.216\nsamples = 62\nvalue = [43.242, 6.072]\nclass = y[0]'),
 Text(0.03762326618186754, 0.375, 'node #302\nlead_time <= 18.0\ngini = 0.097\nsamples = 39\nvalue = [28.331, 1.518]\nclass = y[0]'),
 Text(0.03728916197267654, 0.3472222222222222, 'node #303\ngini = 0.0\nsamples = 35\nvalue = [26.094, 0.0]\nclass = y[0]'),
 Text(0.03795737039105854, 0.3472222222222222, 'node #304\nlead_time <= 23.5\ngini = 0.482\nsamples = 4\nvalue = [2.237, 1.518]\nclass = y[0]'),
 Text(0.03762326618186754, 0.3194444444444444, 'node #305\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.03829147460024954, 0.3194444444444444, 'node #306\ngini = -0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.04046315195999102, 0.375, 'node #307\nlead_time <= 5.5\ngini = 0.358\nsamples = 23\nvalue = [14.911, 4.554]\nclass = y[0]'),
 Text(0.03962789143701353, 0.3472222222222222, 'node #308\ntotal_nights <= 2.5\ngini = 0.498\nsamples = 10\nvalue = [5.219, 4.554]\nclass = y[0]'),
 Text(0.03895968301863153, 0.3194444444444444, 'node #309\narrival_date <= 23.0\ngini = 0.442\nsamples = 6\nvalue = [2.237, 4.554]\nclass = y[1]'),
 Text(0.03862557880944053, 0.2916666666666667, 'node #310\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.03929378722782253, 0.2916666666666667, 'node #311\narrival_month <= 7.5\ngini = 0.372\nsamples = 5\nvalue = [1.491, 4.554]\nclass = y[1]'),
 Text(0.03895968301863153, 0.2638888888888889, 'node #312\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.03962789143701353, 0.2638888888888889, 'node #313\nlead_time <= 2.5\ngini = 0.242\nsamples = 4\nvalue = [0.746, 4.554]\nclass = y[1]'),
 Text(0.03929378722782253, 0.2361111111111111, 'node #314\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.03996199564620452, 0.2361111111111111, 'node #315\narrival_month <= 8.5\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.03962789143701353, 0.20833333333333334, 'node #316\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.04029609985539552, 0.20833333333333334, 'node #317\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.04029609985539552, 0.3194444444444444, 'node #318\navg_price_per_room <= 66.0\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.03996199564620452, 0.2916666666666667, 'node #319\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.04063020406458652, 0.2916666666666667, 'node #320\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.04129841248296852, 0.3472222222222222, 'node #321\narrival_date <= 22.5\ngini = 0.0\nsamples = 13\nvalue = [9.692, 0.0]\nclass = y[0]'),
 Text(0.04096430827377752, 0.3194444444444444, 'node #322\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.04163251669215951, 0.3194444444444444, 'node #323\ngini = 0.0\nsamples = 12\nvalue = [8.947, 0.0]\nclass = y[0]'),
 Text(0.03971141748931128, 0.4027777777777778, 'node #324\ngini = 0.0\nsamples = 25\nvalue = [18.639, 0.0]\nclass = y[0]'),
 Text(0.04071373011688427, 0.4305555555555556, 'node #325\navg_price_per_room <= 115.0\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.04037962590769327, 0.4027777777777778, 'node #326\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.04104783432607527, 0.4027777777777778, 'node #327\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.04171604274445726, 0.4583333333333333, 'node #328\narrival_month <= 8.5\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.04138193853526626, 0.4305555555555556, 'node #329\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.04205014695364826, 0.4305555555555556, 'node #330\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.04547471509785599, 0.5138888888888888, 'node #331\narrival_date <= 18.5\ngini = 0.482\nsamples = 44\nvalue = [24.603, 16.699]\nclass = y[0]'),
 Text(0.04422182431338975, 0.4861111111111111, 'node #332\narrival_date <= 16.5\ngini = 0.474\nsamples = 16\nvalue = [6.71, 10.627]\nclass = y[1]'),
 Text(0.04338656379041225, 0.4583333333333333, 'node #333\narrival_date <= 10.0\ngini = 0.447\nsamples = 10\nvalue = [5.964, 3.036]\nclass = y[0]'),
 Text(0.04271835537203026, 0.4305555555555556, 'node #334\narrival_year <= 2017.5\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.04238425116283926, 0.4027777777777778, 'node #335\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.04305245958122125, 0.4027777777777778, 'node #336\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.04405477220879425, 0.4305555555555556, 'node #337\navg_price_per_room <= 70.0\ngini = 0.0\nsamples = 7\nvalue = [5.219, 0.0]\nclass = y[0]'),
 Text(0.04372066799960325, 0.4027777777777778, 'node #338\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.044388876417985244, 0.4027777777777778, 'node #339\ngini = 0.0\nsamples = 6\nvalue = [4.473, 0.0]\nclass = y[0]'),
 Text(0.045057084836367244, 0.4583333333333333, 'node #340\narrival_month <= 9.5\ngini = 0.163\nsamples = 6\nvalue = [0.746, 7.591]\nclass = y[1]'),
 Text(0.044722980627176244, 0.4305555555555556, 'node #341\ngini = -0.0\nsamples = 5\nvalue = [0.0, 7.591]\nclass = y[1]'),
 Text(0.04539118904555824, 0.4305555555555556, 'node #342\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.046727605882322235, 0.4861111111111111, 'node #343\narrival_date <= 24.0\ngini = 0.378\nsamples = 28\nvalue = [17.893, 6.072]\nclass = y[0]'),
 Text(0.046393501673131235, 0.4583333333333333, 'node #344\narrival_date <= 22.5\ngini = 0.447\nsamples = 20\nvalue = [11.929, 6.072]\nclass = y[0]'),
 Text(0.046059397463940235, 0.4305555555555556, 'node #345\narrival_month <= 7.5\ngini = 0.4\nsamples = 19\nvalue = [11.929, 4.554]\nclass = y[0]'),
 Text(0.045725293254749236, 0.4027777777777778, 'node #346\narrival_date <= 21.5\ngini = 0.498\nsamples = 10\nvalue = [5.219, 4.554]\nclass = y[0]'),
 Text(0.045057084836367244, 0.375, 'node #347\narrival_month <= 6.5\ngini = 0.378\nsamples = 7\nvalue = [4.473, 1.518]\nclass = y[0]'),
 Text(0.044722980627176244, 0.3472222222222222, 'node #348\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.04539118904555824, 0.3472222222222222, 'node #349\ngini = 0.411\nsamples = 6\nvalue = [3.728, 1.518]\nclass = y[0]'),
 Text(0.046393501673131235, 0.375, 'node #350\ntotal_nights <= 1.5\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.046059397463940235, 0.3472222222222222, 'node #351\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.046727605882322235, 0.3472222222222222, 'node #352\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.046393501673131235, 0.4027777777777778, 'node #353\ngini = 0.0\nsamples = 9\nvalue = [6.71, 0.0]\nclass = y[0]'),
 Text(0.046727605882322235, 0.4305555555555556, 'node #354\ngini = -0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.047061710091513234, 0.4583333333333333, 'node #355\ngini = 0.0\nsamples = 8\nvalue = [5.964, 0.0]\nclass = y[0]'),
 Text(0.03407862933748179, 0.5694444444444444, 'node #356\ngini = 0.0\nsamples = 40\nvalue = [29.822, 0.0]\nclass = y[0]'),
 Text(0.04940043955585022, 0.5972222222222222, 'node #357\nlead_time <= 10.5\ngini = 0.485\nsamples = 31\nvalue = [17.148, 12.145]\nclass = y[0]'),
 Text(0.048064022719086226, 0.5694444444444444, 'node #358\nroom_type_reserved_Room_Type 5 <= 0.5\ngini = 0.385\nsamples = 12\nvalue = [3.728, 10.627]\nclass = y[1]'),
 Text(0.04739581430070423, 0.5416666666666666, 'node #359\nlead_time <= 3.5\ngini = 0.216\nsamples = 9\nvalue = [1.491, 10.627]\nclass = y[1]'),
 Text(0.047061710091513234, 0.5138888888888888, 'node #360\ngini = 0.0\nsamples = 5\nvalue = [0.0, 7.591]\nclass = y[1]'),
 Text(0.047729918509895226, 0.5138888888888888, 'node #361\narrival_month <= 9.5\ngini = 0.442\nsamples = 4\nvalue = [1.491, 3.036]\nclass = y[1]'),
 Text(0.04739581430070423, 0.4861111111111111, 'node #362\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.048064022719086226, 0.4861111111111111, 'node #363\narrival_date <= 10.0\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.047729918509895226, 0.4583333333333333, 'node #364\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.048398126928277226, 0.4583333333333333, 'node #365\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.04873223113746822, 0.5416666666666666, 'node #366\nlead_time <= 3.5\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.048398126928277226, 0.5138888888888888, 'node #367\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.04906633534665922, 0.5138888888888888, 'node #368\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.05073685639261421, 0.5694444444444444, 'node #369\narrival_month <= 9.5\ngini = 0.183\nsamples = 19\nvalue = [13.42, 1.518]\nclass = y[0]'),
 Text(0.05006864797423221, 0.5416666666666666, 'node #370\nlead_time <= 12.5\ngini = 0.447\nsamples = 5\nvalue = [2.982, 1.518]\nclass = y[0]'),
 Text(0.04973454376504122, 0.5138888888888888, 'node #371\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.05040275218342321, 0.5138888888888888, 'node #372\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.05140506481099621, 0.5416666666666666, 'node #373\narrival_month <= 10.5\ngini = 0.0\nsamples = 14\nvalue = [10.438, 0.0]\nclass = y[0]'),
 Text(0.05107096060180521, 0.5138888888888888, 'node #374\ngini = 0.0\nsamples = 11\nvalue = [8.201, 0.0]\nclass = y[0]'),
 Text(0.0517391690201872, 0.5138888888888888, 'node #375\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.042240690760452504, 0.625, 'node #376\ngini = 0.0\nsamples = 55\nvalue = [41.005, 0.0]\nclass = y[0]'),
 Text(0.0565419170273078, 0.6805555555555556, 'node #377\nlead_time <= 9.5\ngini = 0.49\nsamples = 59\nvalue = [32.059, 24.29]\nclass = y[0]'),
 Text(0.05499668505979943, 0.6527777777777778, 'node #378\narrival_date <= 13.0\ngini = 0.411\nsamples = 42\nvalue = [26.094, 10.627]\nclass = y[0]'),
 Text(0.05466258085060843, 0.625, 'node #379\nlead_time <= 4.5\ngini = 0.5\nsamples = 21\nvalue = [10.438, 10.627]\nclass = y[1]'),
 Text(0.05324263796154669, 0.5972222222222222, 'node #380\narrival_month <= 9.5\ngini = 0.447\nsamples = 15\nvalue = [8.947, 4.554]\nclass = y[0]'),
 Text(0.05290853375235569, 0.5694444444444444, 'node #381\ngini = 0.0\nsamples = 5\nvalue = [3.728, 0.0]\nclass = y[0]'),
 Text(0.05357674217073769, 0.5694444444444444, 'node #382\narrival_date <= 8.5\ngini = 0.498\nsamples = 10\nvalue = [5.219, 4.554]\nclass = y[0]'),
 Text(0.0527414816477602, 0.5416666666666666, 'node #383\narrival_month <= 10.5\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.0524073774385692, 0.5138888888888888, 'node #384\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.05307558585695119, 0.5138888888888888, 'node #385\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.054412002693715183, 0.5416666666666666, 'node #386\ntotal_nights <= 1.5\ngini = 0.378\nsamples = 7\nvalue = [4.473, 1.518]\nclass = y[0]'),
 Text(0.05374379427533319, 0.5138888888888888, 'node #387\narrival_month <= 10.5\ngini = 0.482\nsamples = 4\nvalue = [2.237, 1.518]\nclass = y[0]'),
 Text(0.05340969006614219, 0.4861111111111111, 'node #388\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.05407789848452419, 0.4861111111111111, 'node #389\ngini = 0.5\nsamples = 3\nvalue = [1.491, 1.518]\nclass = y[1]'),
 Text(0.05508021111209718, 0.5138888888888888, 'node #390\narrival_date <= 11.0\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.05474610690290618, 0.4861111111111111, 'node #391\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.05541431532128818, 0.4861111111111111, 'node #392\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.056082523739670175, 0.5972222222222222, 'node #393\nlead_time <= 5.5\ngini = 0.317\nsamples = 6\nvalue = [1.491, 6.072]\nclass = y[1]'),
 Text(0.05574841953047918, 0.5694444444444444, 'node #394\narrival_date <= 9.0\ngini = 0.195\nsamples = 5\nvalue = [0.746, 6.072]\nclass = y[1]'),
 Text(0.05541431532128818, 0.5416666666666666, 'node #395\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.056082523739670175, 0.5416666666666666, 'node #396\narrival_month <= 9.5\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.05574841953047918, 0.5138888888888888, 'node #397\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.056416627948861174, 0.5138888888888888, 'node #398\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.056416627948861174, 0.5694444444444444, 'node #399\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.05533078926899043, 0.625, 'node #400\ngini = -0.0\nsamples = 21\nvalue = [15.657, 0.0]\nclass = y[0]'),
 Text(0.058087148994816165, 0.6527777777777778, 'node #401\nno_of_adults <= 1.5\ngini = 0.423\nsamples = 17\nvalue = [5.964, 13.663]\nclass = y[1]'),
 Text(0.057753044785625166, 0.625, 'node #402\narrival_date <= 16.5\ngini = 0.5\nsamples = 12\nvalue = [5.964, 6.072]\nclass = y[1]'),
 Text(0.057418940576434166, 0.5972222222222222, 'node #403\nlead_time <= 12.0\ngini = 0.0\nsamples = 8\nvalue = [5.964, 0.0]\nclass = y[0]'),
 Text(0.05708483636724317, 0.5694444444444444, 'node #404\ngini = 0.0\nsamples = 6\nvalue = [4.473, 0.0]\nclass = y[0]'),
 Text(0.057753044785625166, 0.5694444444444444, 'node #405\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.058087148994816165, 0.5972222222222222, 'node #406\ngini = 0.0\nsamples = 4\nvalue = [0.0, 6.072]\nclass = y[1]'),
 Text(0.058421253204007165, 0.625, 'node #407\ngini = -0.0\nsamples = 5\nvalue = [0.0, 7.591]\nclass = y[1]'),
 Text(0.04972540810307115, 0.7083333333333334, 'node #408\ngini = -0.0\nsamples = 178\nvalue = [132.708, 0.0]\nclass = y[0]'),
 Text(0.06265497997984934, 0.7361111111111112, 'node #409\navg_price_per_room <= 50.0\ngini = 0.084\nsamples = 1811\nvalue = [1320.372, 60.725]\nclass = y[0]'),
 Text(0.05942356583158016, 0.7083333333333334, 'node #410\narrival_month <= 9.5\ngini = 0.447\nsamples = 30\nvalue = [17.893, 9.109]\nclass = y[0]'),
 Text(0.05908946162238916, 0.6805555555555556, 'node #411\ntotal_nights <= 2.5\ngini = 0.412\nsamples = 11\nvalue = [3.728, 9.109]\nclass = y[1]'),
 Text(0.05875535741319816, 0.6527777777777778, 'node #412\ngini = 0.0\nsamples = 6\nvalue = [0.0, 9.109]\nclass = y[1]'),
 Text(0.05942356583158016, 0.6527777777777778, 'node #413\narrival_date <= 8.0\ngini = 0.0\nsamples = 5\nvalue = [3.728, 0.0]\nclass = y[0]'),
 Text(0.05908946162238916, 0.625, 'node #414\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.059757670040771156, 0.625, 'node #415\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.059757670040771156, 0.6805555555555556, 'node #416\ngini = -0.0\nsamples = 19\nvalue = [14.165, 0.0]\nclass = y[0]'),
 Text(0.06588639412811852, 0.7083333333333334, 'node #417\narrival_date <= 1.5\ngini = 0.073\nsamples = 1781\nvalue = [1302.479, 51.616]\nclass = y[0]'),
 Text(0.06142819108672615, 0.6805555555555556, 'node #418\navg_price_per_room <= 89.75\ngini = 0.386\nsamples = 27\nvalue = [17.148, 6.072]\nclass = y[0]'),
 Text(0.06075998266834415, 0.6527777777777778, 'node #419\narrival_month <= 8.5\ngini = 0.0\nsamples = 21\nvalue = [15.657, 0.0]\nclass = y[0]'),
 Text(0.06042587845915315, 0.625, 'node #420\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.06109408687753515, 0.625, 'node #421\ngini = 0.0\nsamples = 18\nvalue = [13.42, 0.0]\nclass = y[0]'),
 Text(0.06209639950510814, 0.6527777777777778, 'node #422\nno_of_adults <= 1.5\ngini = 0.317\nsamples = 6\nvalue = [1.491, 6.072]\nclass = y[1]'),
 Text(0.06176229529591714, 0.625, 'node #423\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.06243050371429914, 0.625, 'node #424\ngini = 0.0\nsamples = 4\nvalue = [0.0, 6.072]\nclass = y[1]'),
 Text(0.0703445971695109, 0.6805555555555556, 'node #425\navg_price_per_room <= 85.55\ngini = 0.066\nsamples = 1754\nvalue = [1285.331, 45.543]\nclass = y[0]'),
 Text(0.06606388698925124, 0.6527777777777778, 'node #426\narrival_month <= 9.5\ngini = 0.102\nsamples = 773\nvalue = [560.655, 31.88]\nclass = y[0]'),
 Text(0.06309871213268113, 0.625, 'node #427\nno_of_children <= 0.5\ngini = 0.219\nsamples = 229\nvalue = [159.548, 22.772]\nclass = y[0]'),
 Text(0.06276460792349013, 0.5972222222222222, 'node #428\nlead_time <= 42.5\ngini = 0.207\nsamples = 228\nvalue = [159.548, 21.254]\nclass = y[0]'),
 Text(0.0610105608252374, 0.5694444444444444, 'node #429\nlead_time <= 41.5\ngini = 0.299\nsamples = 121\nvalue = [81.265, 18.217]\nclass = y[0]'),
 Text(0.0606764566160464, 0.5416666666666666, 'node #430\narrival_date <= 17.5\ngini = 0.265\nsamples = 119\nvalue = [81.265, 15.181]\nclass = y[0]'),
 Text(0.059507091883877906, 0.5138888888888888, 'node #431\narrival_date <= 9.5\ngini = 0.374\nsamples = 57\nvalue = [36.532, 12.145]\nclass = y[0]'),
 Text(0.05917298767468691, 0.4861111111111111, 'node #432\ngini = 0.0\nsamples = 14\nvalue = [10.438, 0.0]\nclass = y[0]'),
 Text(0.059841196093068906, 0.4861111111111111, 'node #433\nlead_time <= 3.5\ngini = 0.433\nsamples = 43\nvalue = [26.094, 12.145]\nclass = y[0]'),
 Text(0.059507091883877906, 0.4583333333333333, 'node #434\ngini = 0.0\nsamples = 11\nvalue = [8.201, 0.0]\nclass = y[0]'),
 Text(0.0601753003022599, 0.4583333333333333, 'node #435\nlead_time <= 4.5\ngini = 0.482\nsamples = 32\nvalue = [17.893, 12.145]\nclass = y[0]'),
 Text(0.059841196093068906, 0.4305555555555556, 'node #436\ngini = 0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.0605094045114509, 0.4305555555555556, 'node #437\navg_price_per_room <= 66.525\ngini = 0.418\nsamples = 29\nvalue = [17.893, 7.591]\nclass = y[0]'),
 Text(0.0601753003022599, 0.4027777777777778, 'node #438\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.0608435087206419, 0.4027777777777778, 'node #439\navg_price_per_room <= 84.5\ngini = 0.378\nsamples = 28\nvalue = [17.893, 6.072]\nclass = y[0]'),
 Text(0.05934003977928241, 0.375, 'node #440\nlead_time <= 30.0\ngini = 0.2\nsamples = 17\nvalue = [11.929, 1.518]\nclass = y[0]'),
 Text(0.05900593557009141, 0.3472222222222222, 'node #441\ngini = 0.0\nsamples = 11\nvalue = [8.201, 0.0]\nclass = y[0]'),
 Text(0.059674143988473406, 0.3472222222222222, 'node #442\narrival_month <= 8.5\ngini = 0.411\nsamples = 6\nvalue = [3.728, 1.518]\nclass = y[0]'),
 Text(0.05934003977928241, 0.3194444444444444, 'node #443\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.060008248197664406, 0.3194444444444444, 'node #444\nno_of_adults <= 1.5\ngini = 0.447\nsamples = 5\nvalue = [2.982, 1.518]\nclass = y[0]'),
 Text(0.059674143988473406, 0.2916666666666667, 'node #445\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.0603423524068554, 0.2916666666666667, 'node #446\ngini = 0.482\nsamples = 4\nvalue = [2.237, 1.518]\nclass = y[0]'),
 Text(0.06234697766200139, 0.375, 'node #447\nlead_time <= 32.0\ngini = 0.491\nsamples = 11\nvalue = [5.964, 4.554]\nclass = y[0]'),
 Text(0.06167876924361939, 0.3472222222222222, 'node #448\nlead_time <= 24.5\ngini = 0.442\nsamples = 6\nvalue = [2.237, 4.554]\nclass = y[1]'),
 Text(0.0613446650344284, 0.3194444444444444, 'node #449\ntotal_nights <= 1.5\ngini = 0.482\nsamples = 4\nvalue = [2.237, 1.518]\nclass = y[0]'),
 Text(0.0610105608252374, 0.2916666666666667, 'node #450\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.06167876924361939, 0.2916666666666667, 'node #451\ngini = -0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.06201287345281039, 0.3194444444444444, 'node #452\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.06301518608038338, 0.3472222222222222, 'node #453\narrival_date <= 10.5\ngini = 0.0\nsamples = 5\nvalue = [3.728, 0.0]\nclass = y[0]'),
 Text(0.06268108187119238, 0.3194444444444444, 'node #454\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.06334929028957438, 0.3194444444444444, 'node #455\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.06184582134821489, 0.5138888888888888, 'node #456\nlead_time <= 34.5\ngini = 0.119\nsamples = 62\nvalue = [44.733, 3.036]\nclass = y[0]'),
 Text(0.0611776129298329, 0.4861111111111111, 'node #457\navg_price_per_room <= 64.6\ngini = 0.0\nsamples = 49\nvalue = [36.532, 0.0]\nclass = y[0]'),
 Text(0.0608435087206419, 0.4583333333333333, 'node #458\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.0615117171390239, 0.4583333333333333, 'node #459\ngini = 0.0\nsamples = 48\nvalue = [35.786, 0.0]\nclass = y[0]'),
 Text(0.06251402976659688, 0.4861111111111111, 'node #460\nno_of_adults <= 1.5\ngini = 0.394\nsamples = 13\nvalue = [8.201, 3.036]\nclass = y[0]'),
 Text(0.06217992555740589, 0.4583333333333333, 'node #461\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.06284813397578788, 0.4583333333333333, 'node #462\ngini = 0.0\nsamples = 11\nvalue = [8.201, 0.0]\nclass = y[0]'),
 Text(0.0613446650344284, 0.5416666666666666, 'node #463\ngini = -0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.06451865502174288, 0.5694444444444444, 'node #464\ntotal_nights <= 4.5\ngini = 0.072\nsamples = 107\nvalue = [78.283, 3.036]\nclass = y[0]'),
 Text(0.06385044660336088, 0.5416666666666666, 'node #465\narrival_date <= 26.5\ngini = 0.038\nsamples = 105\nvalue = [77.537, 1.518]\nclass = y[0]'),
 Text(0.06351634239416988, 0.5138888888888888, 'node #466\ngini = 0.0\nsamples = 87\nvalue = [64.863, 0.0]\nclass = y[0]'),
 Text(0.06418455081255188, 0.5138888888888888, 'node #467\narrival_date <= 27.5\ngini = 0.191\nsamples = 18\nvalue = [12.674, 1.518]\nclass = y[0]'),
 Text(0.06385044660336088, 0.4861111111111111, 'node #468\ntotal_nights <= 3.5\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.06351634239416988, 0.4583333333333333, 'node #469\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.06418455081255188, 0.4583333333333333, 'node #470\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.06451865502174288, 0.4861111111111111, 'node #471\ngini = 0.0\nsamples = 16\nvalue = [11.929, 0.0]\nclass = y[0]'),
 Text(0.06518686344012486, 0.5416666666666666, 'node #472\nlead_time <= 73.5\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.06485275923093388, 0.5138888888888888, 'node #473\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.06552096764931586, 0.5138888888888888, 'node #474\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.06343281634187213, 0.5972222222222222, 'node #475\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.06902906184582135, 0.625, 'node #476\nlead_time <= 60.5\ngini = 0.043\nsamples = 544\nvalue = [401.107, 9.109]\nclass = y[0]'),
 Text(0.06702443659067536, 0.5972222222222222, 'node #477\narrival_month <= 10.5\ngini = 0.025\nsamples = 480\nvalue = [355.628, 4.554]\nclass = y[0]'),
 Text(0.06669033238148436, 0.5694444444444444, 'node #478\nlead_time <= 24.0\ngini = 0.06\nsamples = 194\nvalue = [142.4, 4.554]\nclass = y[0]'),
 Text(0.06635622817229336, 0.5416666666666666, 'node #479\ngini = 0.0\nsamples = 75\nvalue = [55.916, 0.0]\nclass = y[0]'),
 Text(0.06702443659067536, 0.5416666666666666, 'node #480\narrival_date <= 10.0\ngini = 0.095\nsamples = 119\nvalue = [86.484, 4.554]\nclass = y[0]'),
 Text(0.06618917606769786, 0.5138888888888888, 'node #481\nlead_time <= 29.5\ngini = 0.323\nsamples = 18\nvalue = [11.929, 3.036]\nclass = y[0]'),
 Text(0.06585507185850686, 0.4861111111111111, 'node #482\navg_price_per_room <= 85.25\ngini = 0.5\nsamples = 6\nvalue = [2.982, 3.036]\nclass = y[1]'),
 Text(0.06518686344012486, 0.4583333333333333, 'node #483\narrival_year <= 2017.5\ngini = 0.442\nsamples = 4\nvalue = [1.491, 3.036]\nclass = y[1]'),
 Text(0.06485275923093388, 0.4305555555555556, 'node #484\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.06552096764931586, 0.4305555555555556, 'node #485\ntotal_nights <= 3.5\ngini = 0.5\nsamples = 3\nvalue = [1.491, 1.518]\nclass = y[1]'),
 Text(0.06518686344012486, 0.4027777777777778, 'node #486\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.06585507185850686, 0.4027777777777778, 'node #487\nlead_time <= 25.5\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.06552096764931586, 0.375, 'node #488\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.06618917606769786, 0.375, 'node #489\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.06652328027688886, 0.4583333333333333, 'node #490\narrival_date <= 6.0\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.06618917606769786, 0.4305555555555556, 'node #491\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.06685738448607986, 0.4305555555555556, 'node #492\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.06652328027688886, 0.4861111111111111, 'node #493\ngini = 0.0\nsamples = 12\nvalue = [8.947, 0.0]\nclass = y[0]'),
 Text(0.06785969711365286, 0.5138888888888888, 'node #494\nno_of_adults <= 1.5\ngini = 0.039\nsamples = 101\nvalue = [74.555, 1.518]\nclass = y[0]'),
 Text(0.06752559290446186, 0.4861111111111111, 'node #495\navg_price_per_room <= 75.3\ngini = 0.161\nsamples = 22\nvalue = [15.657, 1.518]\nclass = y[0]'),
 Text(0.06719148869527086, 0.4583333333333333, 'node #496\ngini = -0.0\nsamples = 17\nvalue = [12.674, 0.0]\nclass = y[0]'),
 Text(0.06785969711365286, 0.4583333333333333, 'node #497\navg_price_per_room <= 76.3\ngini = 0.447\nsamples = 5\nvalue = [2.982, 1.518]\nclass = y[0]'),
 Text(0.06752559290446186, 0.4305555555555556, 'node #498\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.06819380132284385, 0.4305555555555556, 'node #499\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.06819380132284385, 0.4861111111111111, 'node #500\ngini = 0.0\nsamples = 79\nvalue = [58.899, 0.0]\nclass = y[0]'),
 Text(0.06735854079986636, 0.5694444444444444, 'node #501\ngini = 0.0\nsamples = 286\nvalue = [213.228, 0.0]\nclass = y[0]'),
 Text(0.07103368710096733, 0.5972222222222222, 'node #502\nlead_time <= 66.5\ngini = 0.165\nsamples = 64\nvalue = [45.479, 4.554]\nclass = y[0]'),
 Text(0.07019842657798984, 0.5694444444444444, 'node #503\nlead_time <= 65.5\ngini = 0.287\nsamples = 32\nvalue = [21.621, 4.554]\nclass = y[0]'),
 Text(0.06953021815960785, 0.5416666666666666, 'node #504\navg_price_per_room <= 70.125\ngini = 0.155\nsamples = 23\nvalue = [16.402, 1.518]\nclass = y[0]'),
 Text(0.06919611395041685, 0.5138888888888888, 'node #505\nroom_type_reserved_Room_Type 4 <= 0.5\ngini = 0.411\nsamples = 6\nvalue = [3.728, 1.518]\nclass = y[0]'),
 Text(0.06886200974122585, 0.4861111111111111, 'node #506\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.06953021815960785, 0.4861111111111111, 'node #507\ngini = 0.0\nsamples = 5\nvalue = [3.728, 0.0]\nclass = y[0]'),
 Text(0.06986432236879885, 0.5138888888888888, 'node #508\ngini = -0.0\nsamples = 17\nvalue = [12.674, 0.0]\nclass = y[0]'),
 Text(0.07086663499637184, 0.5416666666666666, 'node #509\nroom_type_reserved_Room_Type 3 <= 0.5\ngini = 0.465\nsamples = 9\nvalue = [5.219, 3.036]\nclass = y[0]'),
 Text(0.07053253078718084, 0.5138888888888888, 'node #510\narrival_year <= 2017.5\ngini = 0.482\nsamples = 8\nvalue = [4.473, 3.036]\nclass = y[0]'),
 Text(0.07019842657798984, 0.4861111111111111, 'node #511\ngini = 0.495\nsamples = 7\nvalue = [3.728, 3.036]\nclass = y[0]'),
 Text(0.07086663499637184, 0.4861111111111111, 'node #512\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.07120073920556283, 0.5138888888888888, 'node #513\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.07186894762394483, 0.5694444444444444, 'node #514\narrival_month <= 10.5\ngini = 0.0\nsamples = 32\nvalue = [23.858, 0.0]\nclass = y[0]'),
 Text(0.07153484341475383, 0.5416666666666666, 'node #515\ngini = 0.0\nsamples = 5\nvalue = [3.728, 0.0]\nclass = y[0]'),
 Text(0.07220305183313583, 0.5416666666666666, 'node #516\ngini = 0.0\nsamples = 27\nvalue = [20.13, 0.0]\nclass = y[0]'),
 Text(0.07462530734977056, 0.6527777777777778, 'node #517\nlead_time <= 27.5\ngini = 0.036\nsamples = 981\nvalue = [724.676, 13.663]\nclass = y[0]'),
 Text(0.07379004682679306, 0.625, 'node #518\narrival_year <= 2017.5\ngini = 0.093\nsamples = 325\nvalue = [236.34, 12.145]\nclass = y[0]'),
 Text(0.07312183840841108, 0.5972222222222222, 'node #519\navg_price_per_room <= 158.835\ngini = 0.023\nsamples = 174\nvalue = [128.98, 1.518]\nclass = y[0]'),
 Text(0.07278773419922008, 0.5694444444444444, 'node #520\ngini = 0.0\nsamples = 173\nvalue = [128.98, 0.0]\nclass = y[0]'),
 Text(0.07345594261760208, 0.5694444444444444, 'node #521\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.07445825524517506, 0.5972222222222222, 'node #522\navg_price_per_room <= 118.575\ngini = 0.164\nsamples = 151\nvalue = [107.359, 10.627]\nclass = y[0]'),
 Text(0.07412415103598406, 0.5694444444444444, 'node #523\navg_price_per_room <= 117.075\ngini = 0.271\nsamples = 81\nvalue = [55.171, 10.627]\nclass = y[0]'),
 Text(0.07379004682679306, 0.5416666666666666, 'node #524\ntotal_nights <= 3.5\ngini = 0.213\nsamples = 79\nvalue = [55.171, 7.591]\nclass = y[0]'),
 Text(0.07220305183313583, 0.5138888888888888, 'node #525\narrival_date <= 4.5\ngini = 0.112\nsamples = 66\nvalue = [47.715, 3.036]\nclass = y[0]'),
 Text(0.07153484341475383, 0.4861111111111111, 'node #526\navg_price_per_room <= 100.0\ngini = 0.5\nsamples = 3\nvalue = [1.491, 1.518]\nclass = y[1]'),
 Text(0.07120073920556283, 0.4583333333333333, 'node #527\ngini = -0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.07186894762394483, 0.4583333333333333, 'node #528\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.07287126025151783, 0.4861111111111111, 'node #529\narrival_date <= 25.5\ngini = 0.062\nsamples = 63\nvalue = [46.224, 1.518]\nclass = y[0]'),
 Text(0.07253715604232683, 0.4583333333333333, 'node #530\ngini = 0.0\nsamples = 52\nvalue = [38.769, 0.0]\nclass = y[0]'),
 Text(0.07320536446070883, 0.4583333333333333, 'node #531\nno_of_adults <= 1.5\ngini = 0.281\nsamples = 11\nvalue = [7.456, 1.518]\nclass = y[0]'),
 Text(0.07253715604232683, 0.4305555555555556, 'node #532\narrival_date <= 26.5\ngini = 0.5\nsamples = 3\nvalue = [1.491, 1.518]\nclass = y[1]'),
 Text(0.07220305183313583, 0.4027777777777778, 'node #533\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.07287126025151783, 0.4027777777777778, 'node #534\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.07387357287909081, 0.4305555555555556, 'node #535\narrival_month <= 6.5\ngini = 0.0\nsamples = 8\nvalue = [5.964, 0.0]\nclass = y[0]'),
 Text(0.07353946866989983, 0.4027777777777778, 'node #536\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.07420767708828181, 0.4027777777777778, 'node #537\ngini = 0.0\nsamples = 6\nvalue = [4.473, 0.0]\nclass = y[0]'),
 Text(0.07537704182045031, 0.5138888888888888, 'node #538\navg_price_per_room <= 92.5\ngini = 0.471\nsamples = 13\nvalue = [7.456, 4.554]\nclass = y[0]'),
 Text(0.07504293761125931, 0.4861111111111111, 'node #539\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.07571114602964131, 0.4861111111111111, 'node #540\nlead_time <= 1.0\ngini = 0.411\nsamples = 12\nvalue = [7.456, 3.036]\nclass = y[0]'),
 Text(0.07487588550666381, 0.4583333333333333, 'node #541\narrival_date <= 28.0\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.07454178129747281, 0.4305555555555556, 'node #542\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.07520998971585481, 0.4305555555555556, 'node #543\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.07654640655261881, 0.4583333333333333, 'node #544\nlead_time <= 13.5\ngini = 0.301\nsamples = 10\nvalue = [6.71, 1.518]\nclass = y[0]'),
 Text(0.07587819813423681, 0.4305555555555556, 'node #545\narrival_date <= 4.5\ngini = 0.0\nsamples = 7\nvalue = [5.219, 0.0]\nclass = y[0]'),
 Text(0.07554409392504581, 0.4027777777777778, 'node #546\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.07621230234342781, 0.4027777777777778, 'node #547\ngini = 0.0\nsamples = 6\nvalue = [4.473, 0.0]\nclass = y[0]'),
 Text(0.0772146149710008, 0.4305555555555556, 'node #548\nlead_time <= 18.0\ngini = 0.5\nsamples = 3\nvalue = [1.491, 1.518]\nclass = y[1]'),
 Text(0.0768805107618098, 0.4027777777777778, 'node #549\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.0775487191801918, 0.4027777777777778, 'node #550\ngini = -0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.07445825524517506, 0.5416666666666666, 'node #551\ngini = -0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.07479235945436606, 0.5694444444444444, 'node #552\ngini = 0.0\nsamples = 70\nvalue = [52.189, 0.0]\nclass = y[0]'),
 Text(0.07546056787274806, 0.625, 'node #553\nroom_type_reserved_Room_Type 4 <= 0.5\ngini = 0.006\nsamples = 656\nvalue = [488.336, 1.518]\nclass = y[0]'),
 Text(0.07512646366355706, 0.5972222222222222, 'node #554\ngini = 0.0\nsamples = 629\nvalue = [468.952, 0.0]\nclass = y[0]'),
 Text(0.07579467208193906, 0.5972222222222222, 'node #555\nlead_time <= 69.5\ngini = 0.135\nsamples = 27\nvalue = [19.384, 1.518]\nclass = y[0]'),
 Text(0.07546056787274806, 0.5694444444444444, 'node #556\ngini = 0.0\nsamples = 22\nvalue = [16.402, 0.0]\nclass = y[0]'),
 Text(0.07612877629113006, 0.5694444444444444, 'node #557\ntype_of_meal_plan_Meal Plan 2 <= 0.5\ngini = 0.447\nsamples = 5\nvalue = [2.982, 1.518]\nclass = y[0]'),
 Text(0.07579467208193906, 0.5416666666666666, 'node #558\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.07646288050032106, 0.5416666666666666, 'node #559\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.0797830410791566, 0.7916666666666666, 'node #560\nlead_time <= 78.5\ngini = 0.442\nsamples = 329\nvalue = [197.571, 97.159]\nclass = y[0]'),
 Text(0.07579467208193906, 0.7638888888888888, 'node #561\navg_price_per_room <= 79.78\ngini = 0.42\nsamples = 73\nvalue = [25.349, 59.207]\nclass = y[1]'),
 Text(0.07512646366355706, 0.7361111111111112, 'node #562\narrival_month <= 3.5\ngini = 0.168\nsamples = 21\nvalue = [14.911, 1.518]\nclass = y[0]'),
 Text(0.07479235945436606, 0.7083333333333334, 'node #563\nno_of_adults <= 1.5\ngini = 0.482\nsamples = 4\nvalue = [2.237, 1.518]\nclass = y[0]'),
 Text(0.07445825524517506, 0.6805555555555556, 'node #564\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.07512646366355706, 0.6805555555555556, 'node #565\ngini = -0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.07546056787274806, 0.7083333333333334, 'node #566\ngini = -0.0\nsamples = 17\nvalue = [12.674, 0.0]\nclass = y[0]'),
 Text(0.07646288050032106, 0.7361111111111112, 'node #567\narrival_month <= 3.5\ngini = 0.259\nsamples = 52\nvalue = [10.438, 57.688]\nclass = y[1]'),
 Text(0.07612877629113006, 0.7083333333333334, 'node #568\ngini = 0.0\nsamples = 19\nvalue = [0.0, 28.844]\nclass = y[1]'),
 Text(0.07679698470951205, 0.7083333333333334, 'node #569\ntotal_nights <= 2.5\ngini = 0.39\nsamples = 33\nvalue = [10.438, 28.844]\nclass = y[1]'),
 Text(0.07612877629113006, 0.6805555555555556, 'node #570\navg_price_per_room <= 96.215\ngini = 0.134\nsamples = 22\nvalue = [2.237, 28.844]\nclass = y[1]'),
 Text(0.07579467208193906, 0.6527777777777778, 'node #571\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.07646288050032106, 0.6527777777777778, 'node #572\narrival_month <= 11.0\ngini = 0.049\nsamples = 20\nvalue = [0.746, 28.844]\nclass = y[1]'),
 Text(0.07612877629113006, 0.625, 'node #573\ngini = 0.0\nsamples = 19\nvalue = [0.0, 28.844]\nclass = y[1]'),
 Text(0.07679698470951205, 0.625, 'node #574\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.07746519312789404, 0.6805555555555556, 'node #575\narrival_year <= 2017.5\ngini = 0.0\nsamples = 11\nvalue = [8.201, 0.0]\nclass = y[0]'),
 Text(0.07713108891870304, 0.6527777777777778, 'node #576\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.07779929733708504, 0.6527777777777778, 'node #577\ngini = 0.0\nsamples = 10\nvalue = [7.456, 0.0]\nclass = y[0]'),
 Text(0.08377141007637413, 0.7638888888888888, 'node #578\ntotal_nights <= 3.5\ngini = 0.296\nsamples = 256\nvalue = [172.222, 37.953]\nclass = y[0]'),
 Text(0.08047213101061303, 0.7361111111111112, 'node #579\nmarket_segment_type_Corporate <= 0.5\ngini = 0.203\nsamples = 218\nvalue = [152.838, 19.736]\nclass = y[0]'),
 Text(0.07880160996465804, 0.7083333333333334, 'node #580\ntotal_nights <= 2.5\ngini = 0.137\nsamples = 186\nvalue = [133.454, 10.627]\nclass = y[0]'),
 Text(0.07846750575546704, 0.6805555555555556, 'node #581\ngini = -0.0\nsamples = 110\nvalue = [82.011, 0.0]\nclass = y[0]'),
 Text(0.07913571417384904, 0.6805555555555556, 'node #582\navg_price_per_room <= 98.45\ngini = 0.284\nsamples = 76\nvalue = [51.443, 10.627]\nclass = y[0]'),
 Text(0.07846750575546704, 0.6527777777777778, 'node #583\nlead_time <= 80.5\ngini = 0.064\nsamples = 60\nvalue = [43.988, 1.518]\nclass = y[0]'),
 Text(0.07813340154627604, 0.625, 'node #584\narrival_date <= 25.0\ngini = 0.349\nsamples = 8\nvalue = [5.219, 1.518]\nclass = y[0]'),
 Text(0.07779929733708504, 0.5972222222222222, 'node #585\ngini = -0.0\nsamples = 5\nvalue = [3.728, 0.0]\nclass = y[0]'),
 Text(0.07846750575546704, 0.5972222222222222, 'node #586\ntype_of_meal_plan_Meal Plan 2 <= 0.5\ngini = 0.5\nsamples = 3\nvalue = [1.491, 1.518]\nclass = y[1]'),
 Text(0.07813340154627604, 0.5694444444444444, 'node #587\narrival_date <= 26.5\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.07779929733708504, 0.5416666666666666, 'node #588\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.07846750575546704, 0.5416666666666666, 'node #589\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.07880160996465804, 0.5694444444444444, 'node #590\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.07880160996465804, 0.625, 'node #591\ngini = -0.0\nsamples = 52\nvalue = [38.769, 0.0]\nclass = y[0]'),
 Text(0.07980392259223103, 0.6527777777777778, 'node #592\narrival_month <= 5.5\ngini = 0.495\nsamples = 16\nvalue = [7.456, 9.109]\nclass = y[1]'),
 Text(0.07946981838304004, 0.625, 'node #593\ngini = 0.0\nsamples = 4\nvalue = [0.0, 6.072]\nclass = y[1]'),
 Text(0.08013802680142203, 0.625, 'node #594\nlead_time <= 88.0\ngini = 0.411\nsamples = 12\nvalue = [7.456, 3.036]\nclass = y[0]'),
 Text(0.07980392259223103, 0.5972222222222222, 'node #595\narrival_year <= 2017.5\ngini = 0.0\nsamples = 10\nvalue = [7.456, 0.0]\nclass = y[0]'),
 Text(0.07946981838304004, 0.5694444444444444, 'node #596\ngini = 0.0\nsamples = 5\nvalue = [3.728, 0.0]\nclass = y[0]'),
 Text(0.08013802680142203, 0.5694444444444444, 'node #597\ngini = 0.0\nsamples = 5\nvalue = [3.728, 0.0]\nclass = y[0]'),
 Text(0.08047213101061303, 0.5972222222222222, 'node #598\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.08214265205656802, 0.7083333333333334, 'node #599\nlead_time <= 86.5\ngini = 0.435\nsamples = 32\nvalue = [19.384, 9.109]\nclass = y[0]'),
 Text(0.08147444363818603, 0.6805555555555556, 'node #600\navg_price_per_room <= 97.5\ngini = 0.332\nsamples = 26\nvalue = [17.148, 4.554]\nclass = y[0]'),
 Text(0.08114033942899503, 0.6527777777777778, 'node #601\ngini = 0.0\nsamples = 10\nvalue = [7.456, 0.0]\nclass = y[0]'),
 Text(0.08180854784737703, 0.6527777777777778, 'node #602\ntotal_nights <= 2.5\ngini = 0.435\nsamples = 16\nvalue = [9.692, 4.554]\nclass = y[0]'),
 Text(0.08147444363818603, 0.625, 'node #603\navg_price_per_room <= 114.865\ngini = 0.459\nsamples = 14\nvalue = [8.201, 4.554]\nclass = y[0]'),
 Text(0.08114033942899503, 0.5972222222222222, 'node #604\ngini = 0.471\nsamples = 13\nvalue = [7.456, 4.554]\nclass = y[0]'),
 Text(0.08180854784737703, 0.5972222222222222, 'node #605\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.08214265205656802, 0.625, 'node #606\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.08281086047495001, 0.6805555555555556, 'node #607\nlead_time <= 87.5\ngini = 0.442\nsamples = 6\nvalue = [2.237, 4.554]\nclass = y[1]'),
 Text(0.08247675626575901, 0.6527777777777778, 'node #608\ngini = 0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.08314496468414101, 0.6527777777777778, 'node #609\ntotal_nights <= 2.5\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.08281086047495001, 0.625, 'node #610\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.08347906889333201, 0.625, 'node #611\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.08707068914213524, 0.7361111111111112, 'node #612\narrival_date <= 24.5\ngini = 0.5\nsamples = 38\nvalue = [19.384, 18.217]\nclass = y[0]'),
 Text(0.08531664204388249, 0.7083333333333334, 'node #613\narrival_date <= 8.5\ngini = 0.386\nsamples = 27\nvalue = [17.148, 6.072]\nclass = y[0]'),
 Text(0.08414727731171401, 0.6805555555555556, 'node #614\nlead_time <= 84.0\ngini = 0.5\nsamples = 9\nvalue = [4.473, 4.554]\nclass = y[1]'),
 Text(0.08381317310252301, 0.6527777777777778, 'node #615\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.08448138152090501, 0.6527777777777778, 'node #616\navg_price_per_room <= 84.025\ngini = 0.372\nsamples = 5\nvalue = [1.491, 4.554]\nclass = y[1]'),
 Text(0.08414727731171401, 0.625, 'node #617\ngini = -0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.08481548573009601, 0.625, 'node #618\narrival_month <= 6.0\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.08448138152090501, 0.5972222222222222, 'node #619\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.08514958993928701, 0.5972222222222222, 'node #620\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.08648600677605099, 0.6805555555555556, 'node #621\narrival_date <= 21.5\ngini = 0.191\nsamples = 18\nvalue = [12.674, 1.518]\nclass = y[0]'),
 Text(0.08581779835766899, 0.6527777777777778, 'node #622\nno_of_adults <= 2.5\ngini = 0.0\nsamples = 13\nvalue = [9.692, 0.0]\nclass = y[0]'),
 Text(0.08548369414847799, 0.625, 'node #623\ngini = 0.0\nsamples = 11\nvalue = [8.201, 0.0]\nclass = y[0]'),
 Text(0.08615190256685999, 0.625, 'node #624\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.08715421519443299, 0.6527777777777778, 'node #625\navg_price_per_room <= 87.875\ngini = 0.447\nsamples = 5\nvalue = [2.982, 1.518]\nclass = y[0]'),
 Text(0.08682011098524199, 0.625, 'node #626\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.08748831940362399, 0.625, 'node #627\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.08882473624038797, 0.7083333333333334, 'node #628\narrival_month <= 3.5\ngini = 0.263\nsamples = 11\nvalue = [2.237, 12.145]\nclass = y[1]'),
 Text(0.08815652782200599, 0.6805555555555556, 'node #629\navg_price_per_room <= 67.25\ngini = 0.109\nsamples = 9\nvalue = [0.746, 12.145]\nclass = y[1]'),
 Text(0.08782242361281499, 0.6527777777777778, 'node #630\ngini = 0.0\nsamples = 7\nvalue = [0.0, 10.627]\nclass = y[1]'),
 Text(0.08849063203119698, 0.6527777777777778, 'node #631\nlead_time <= 80.5\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.08815652782200599, 0.625, 'node #632\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.08882473624038797, 0.625, 'node #633\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.08949294465876997, 0.6805555555555556, 'node #634\ntype_of_meal_plan_Meal Plan 2 <= 0.5\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.08915884044957897, 0.6527777777777778, 'node #635\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.08982704886796097, 0.6527777777777778, 'node #636\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.08443961849475613, 0.8194444444444444, 'node #637\narrival_date <= 28.0\ngini = 0.103\nsamples = 19\nvalue = [1.491, 25.808]\nclass = y[1]'),
 Text(0.08410551428556513, 0.7916666666666666, 'node #638\ngini = 0.0\nsamples = 17\nvalue = [0.0, 25.808]\nclass = y[1]'),
 Text(0.08477372270394713, 0.7916666666666666, 'node #639\navg_price_per_room <= 240.375\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.08443961849475613, 0.7638888888888888, 'node #640\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.08510782691313813, 0.7638888888888888, 'node #641\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.09533976831961244, 0.8472222222222222, 'node #642\navg_price_per_room <= 92.8\ngini = 0.498\nsamples = 179\nvalue = [85.738, 97.159]\nclass = y[1]'),
 Text(0.09316809095987096, 0.8194444444444444, 'node #643\narrival_date <= 22.5\ngini = 0.256\nsamples = 100\nvalue = [68.591, 12.145]\nclass = y[0]'),
 Text(0.09283398675067996, 0.7916666666666666, 'node #644\nroom_type_reserved_Room_Type 5 <= 0.5\ngini = 0.336\nsamples = 68\nvalue = [44.733, 12.145]\nclass = y[0]'),
 Text(0.09249988254148896, 0.7638888888888888, 'node #645\nlead_time <= 72.5\ngini = 0.31\nsamples = 67\nvalue = [44.733, 10.627]\nclass = y[0]'),
 Text(0.09216577833229796, 0.7361111111111112, 'node #646\nlead_time <= 33.0\ngini = 0.387\nsamples = 47\nvalue = [29.822, 10.627]\nclass = y[0]'),
 Text(0.09082936149553397, 0.7083333333333334, 'node #647\narrival_date <= 16.0\ngini = 0.196\nsamples = 35\nvalue = [24.603, 3.036]\nclass = y[0]'),
 Text(0.09049525728634297, 0.6805555555555556, 'node #648\ngini = 0.0\nsamples = 25\nvalue = [18.639, 0.0]\nclass = y[0]'),
 Text(0.09116346570472496, 0.6805555555555556, 'node #649\narrival_month <= 8.5\ngini = 0.447\nsamples = 10\nvalue = [5.964, 3.036]\nclass = y[0]'),
 Text(0.09049525728634297, 0.6527777777777778, 'node #650\narrival_date <= 19.5\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.09016115307715197, 0.625, 'node #651\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.09082936149553397, 0.625, 'node #652\nno_of_adults <= 1.5\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.09049525728634297, 0.5972222222222222, 'node #653\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.09116346570472496, 0.5972222222222222, 'node #654\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.09183167412310696, 0.6527777777777778, 'node #655\narrival_date <= 18.5\ngini = 0.0\nsamples = 7\nvalue = [5.219, 0.0]\nclass = y[0]'),
 Text(0.09149756991391596, 0.625, 'node #656\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.09216577833229796, 0.625, 'node #657\ngini = 0.0\nsamples = 5\nvalue = [3.728, 0.0]\nclass = y[0]'),
 Text(0.09350219516906196, 0.7083333333333334, 'node #658\narrival_month <= 6.5\ngini = 0.483\nsamples = 12\nvalue = [5.219, 7.591]\nclass = y[1]'),
 Text(0.09316809095987096, 0.6805555555555556, 'node #659\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.09383629937825295, 0.6805555555555556, 'node #660\nlead_time <= 47.5\ngini = 0.405\nsamples = 9\nvalue = [2.982, 7.591]\nclass = y[1]'),
 Text(0.09316809095987096, 0.6527777777777778, 'node #661\nlead_time <= 36.0\ngini = 0.482\nsamples = 4\nvalue = [2.237, 1.518]\nclass = y[0]'),
 Text(0.09283398675067996, 0.625, 'node #662\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.09350219516906196, 0.625, 'node #663\ngini = -0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.09450450779663494, 0.6527777777777778, 'node #664\navg_price_per_room <= 82.5\ngini = 0.195\nsamples = 5\nvalue = [0.746, 6.072]\nclass = y[1]'),
 Text(0.09417040358744394, 0.625, 'node #665\ngini = -0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.09483861200582594, 0.625, 'node #666\narrival_date <= 5.5\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.09450450779663494, 0.5972222222222222, 'node #667\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.09517271621501694, 0.5972222222222222, 'node #668\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.09283398675067996, 0.7361111111111112, 'node #669\ngini = 0.0\nsamples = 20\nvalue = [14.911, 0.0]\nclass = y[0]'),
 Text(0.09316809095987096, 0.7638888888888888, 'node #670\ngini = -0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.09350219516906196, 0.7916666666666666, 'node #671\ngini = 0.0\nsamples = 32\nvalue = [23.858, 0.0]\nclass = y[0]'),
 Text(0.09751144567935392, 0.8194444444444444, 'node #672\narrival_month <= 8.5\ngini = 0.279\nsamples = 79\nvalue = [17.148, 85.014]\nclass = y[1]'),
 Text(0.09717734147016292, 0.7916666666666666, 'node #673\narrival_date <= 21.0\ngini = 0.184\nsamples = 69\nvalue = [9.692, 85.014]\nclass = y[1]'),
 Text(0.09684323726097192, 0.7638888888888888, 'node #674\nno_of_adults <= 1.5\ngini = 0.095\nsamples = 62\nvalue = [4.473, 85.014]\nclass = y[1]'),
 Text(0.09617502884258994, 0.7361111111111112, 'node #675\narrival_month <= 4.5\ngini = 0.019\nsamples = 53\nvalue = [0.746, 78.942]\nclass = y[1]'),
 Text(0.09584092463339894, 0.7083333333333334, 'node #676\ntotal_nights <= 13.5\ngini = 0.242\nsamples = 4\nvalue = [0.746, 4.554]\nclass = y[1]'),
 Text(0.09550682042420794, 0.6805555555555556, 'node #677\nlead_time <= 3.0\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.09517271621501694, 0.6527777777777778, 'node #678\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.09584092463339894, 0.6527777777777778, 'node #679\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.09617502884258994, 0.6805555555555556, 'node #680\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.09650913305178094, 0.7083333333333334, 'node #681\ngini = 0.0\nsamples = 49\nvalue = [0.0, 74.388]\nclass = y[1]'),
 Text(0.09751144567935392, 0.7361111111111112, 'node #682\ntotal_nights <= 6.5\ngini = 0.471\nsamples = 9\nvalue = [3.728, 6.072]\nclass = y[1]'),
 Text(0.09717734147016292, 0.7083333333333334, 'node #683\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.09784554988854492, 0.7083333333333334, 'node #684\narrival_month <= 6.5\ngini = 0.393\nsamples = 7\nvalue = [2.237, 6.072]\nclass = y[1]'),
 Text(0.09751144567935392, 0.6805555555555556, 'node #685\nlead_time <= 83.0\ngini = 0.482\nsamples = 4\nvalue = [2.237, 1.518]\nclass = y[0]'),
 Text(0.09717734147016292, 0.6527777777777778, 'node #686\ngini = -0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.09784554988854492, 0.6527777777777778, 'node #687\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.09817965409773592, 0.6805555555555556, 'node #688\ngini = 0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.09751144567935392, 0.7638888888888888, 'node #689\ngini = 0.0\nsamples = 7\nvalue = [5.219, 0.0]\nclass = y[0]'),
 Text(0.09784554988854492, 0.7916666666666666, 'node #690\ngini = 0.0\nsamples = 10\nvalue = [7.456, 0.0]\nclass = y[0]'),
 Text(0.13054143153212883, 0.875, 'node #691\nlead_time <= 117.5\ngini = 0.5\nsamples = 1246\nvalue = [612.844, 643.681]\nclass = y[1]'),
 Text(0.11889019978387634, 0.8472222222222222, 'node #692\navg_price_per_room <= 93.575\ngini = 0.465\nsamples = 737\nvalue = [297.475, 513.123]\nclass = y[1]'),
 Text(0.10978194479972019, 0.8194444444444444, 'node #693\navg_price_per_room <= 75.07\ngini = 0.5\nsamples = 438\nvalue = [214.719, 227.717]\nclass = y[1]'),
 Text(0.10532113156919351, 0.7916666666666666, 'node #694\narrival_month <= 7.5\ngini = 0.446\nsamples = 227\nvalue = [85.738, 170.029]\nclass = y[1]'),
 Text(0.1020218525034324, 0.7638888888888888, 'node #695\navg_price_per_room <= 58.75\ngini = 0.287\nsamples = 140\nvalue = [31.313, 148.775]\nclass = y[1]'),
 Text(0.1016877482942414, 0.7361111111111112, 'node #696\ngini = 0.0\nsamples = 14\nvalue = [10.438, 0.0]\nclass = y[0]'),
 Text(0.1023559567126234, 0.7361111111111112, 'node #697\ntotal_nights <= 3.5\ngini = 0.216\nsamples = 126\nvalue = [20.875, 148.775]\nclass = y[1]'),
 Text(0.1005183835620729, 0.7083333333333334, 'node #698\nlead_time <= 104.5\ngini = 0.11\nsamples = 98\nvalue = [8.201, 132.076]\nclass = y[1]'),
 Text(0.09918196672530892, 0.6805555555555556, 'node #699\nlead_time <= 101.5\ngini = 0.497\nsamples = 11\nvalue = [5.219, 6.072]\nclass = y[1]'),
 Text(0.09851375830692692, 0.6527777777777778, 'node #700\navg_price_per_room <= 66.875\ngini = 0.393\nsamples = 7\nvalue = [2.237, 6.072]\nclass = y[1]'),
 Text(0.09817965409773592, 0.625, 'node #701\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.09884786251611792, 0.625, 'node #702\narrival_month <= 2.5\ngini = 0.317\nsamples = 6\nvalue = [1.491, 6.072]\nclass = y[1]'),
 Text(0.09851375830692692, 0.5972222222222222, 'node #703\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.09918196672530892, 0.5972222222222222, 'node #704\navg_price_per_room <= 72.5\ngini = 0.195\nsamples = 5\nvalue = [0.746, 6.072]\nclass = y[1]'),
 Text(0.09884786251611792, 0.5694444444444444, 'node #705\ngini = -0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.09951607093449992, 0.5694444444444444, 'node #706\ntotal_nights <= 2.5\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.09918196672530892, 0.5416666666666666, 'node #707\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.0998501751436909, 0.5416666666666666, 'node #708\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.0998501751436909, 0.6527777777777778, 'node #709\nlead_time <= 102.5\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.09951607093449992, 0.625, 'node #710\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.1001842793528819, 0.625, 'node #711\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.1018548003988369, 0.6805555555555556, 'node #712\nlead_time <= 116.5\ngini = 0.045\nsamples = 87\nvalue = [2.982, 126.004]\nclass = y[1]'),
 Text(0.1015206961896459, 0.6527777777777778, 'node #713\nlead_time <= 112.0\ngini = 0.034\nsamples = 86\nvalue = [2.237, 126.004]\nclass = y[1]'),
 Text(0.1008524877712639, 0.625, 'node #714\navg_price_per_room <= 73.625\ngini = 0.0\nsamples = 47\nvalue = [0.0, 71.351]\nclass = y[1]'),
 Text(0.1005183835620729, 0.5972222222222222, 'node #715\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.1011865919804549, 0.5972222222222222, 'node #716\ngini = 0.0\nsamples = 46\nvalue = [0.0, 69.833]\nclass = y[1]'),
 Text(0.1021889046080279, 0.625, 'node #717\ntotal_nights <= 2.5\ngini = 0.076\nsamples = 39\nvalue = [2.237, 54.652]\nclass = y[1]'),
 Text(0.1018548003988369, 0.5972222222222222, 'node #718\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.1025230088172189, 0.5972222222222222, 'node #719\navg_price_per_room <= 73.625\ngini = 0.052\nsamples = 38\nvalue = [1.491, 54.652]\nclass = y[1]'),
 Text(0.1021889046080279, 0.5694444444444444, 'node #720\ngini = 0.0\nsamples = 25\nvalue = [0.0, 37.953]\nclass = y[1]'),
 Text(0.10285711302640989, 0.5694444444444444, 'node #721\narrival_month <= 4.5\ngini = 0.151\nsamples = 13\nvalue = [1.491, 16.699]\nclass = y[1]'),
 Text(0.1025230088172189, 0.5416666666666666, 'node #722\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.10319121723560089, 0.5416666666666666, 'node #723\ngini = -0.0\nsamples = 11\nvalue = [0.0, 16.699]\nclass = y[1]'),
 Text(0.1021889046080279, 0.6527777777777778, 'node #724\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.10419352986317389, 0.7083333333333334, 'node #725\narrival_date <= 23.5\ngini = 0.491\nsamples = 28\nvalue = [12.674, 16.699]\nclass = y[1]'),
 Text(0.10352532144479189, 0.6805555555555556, 'node #726\nlead_time <= 103.5\ngini = 0.234\nsamples = 14\nvalue = [9.692, 1.518]\nclass = y[0]'),
 Text(0.10319121723560089, 0.6527777777777778, 'node #727\narrival_month <= 6.0\ngini = 0.447\nsamples = 5\nvalue = [2.982, 1.518]\nclass = y[0]'),
 Text(0.10285711302640989, 0.625, 'node #728\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.10352532144479189, 0.625, 'node #729\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.10385942565398289, 0.6527777777777778, 'node #730\ngini = 0.0\nsamples = 9\nvalue = [6.71, 0.0]\nclass = y[0]'),
 Text(0.10486173828155589, 0.6805555555555556, 'node #731\navg_price_per_room <= 73.625\ngini = 0.274\nsamples = 14\nvalue = [2.982, 15.181]\nclass = y[1]'),
 Text(0.10452763407236489, 0.6527777777777778, 'node #732\narrival_month <= 1.5\ngini = 0.089\nsamples = 11\nvalue = [0.746, 15.181]\nclass = y[1]'),
 Text(0.10419352986317389, 0.625, 'node #733\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.10486173828155589, 0.625, 'node #734\ngini = 0.0\nsamples = 10\nvalue = [0.0, 15.181]\nclass = y[1]'),
 Text(0.10519584249074689, 0.6527777777777778, 'node #735\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.1086204106349546, 0.7638888888888888, 'node #736\narrival_date <= 29.5\ngini = 0.404\nsamples = 87\nvalue = [54.425, 21.254]\nclass = y[0]'),
 Text(0.10770162405967937, 0.7361111111111112, 'node #737\ntype_of_meal_plan_Not Selected <= 0.5\ngini = 0.232\nsamples = 71\nvalue = [49.206, 7.591]\nclass = y[0]'),
 Text(0.10686636353670187, 0.7083333333333334, 'node #738\navg_price_per_room <= 71.125\ngini = 0.161\nsamples = 66\nvalue = [46.97, 4.554]\nclass = y[0]'),
 Text(0.10619815511831987, 0.6805555555555556, 'node #739\narrival_month <= 9.5\ngini = 0.0\nsamples = 43\nvalue = [32.059, 0.0]\nclass = y[0]'),
 Text(0.10586405090912887, 0.6527777777777778, 'node #740\ngini = 0.0\nsamples = 7\nvalue = [5.219, 0.0]\nclass = y[0]'),
 Text(0.10653225932751087, 0.6527777777777778, 'node #741\ngini = 0.0\nsamples = 36\nvalue = [26.84, 0.0]\nclass = y[0]'),
 Text(0.10753457195508387, 0.6805555555555556, 'node #742\ntotal_nights <= 4.5\ngini = 0.358\nsamples = 23\nvalue = [14.911, 4.554]\nclass = y[0]'),
 Text(0.10720046774589287, 0.6527777777777778, 'node #743\narrival_year <= 2017.5\ngini = 0.281\nsamples = 22\nvalue = [14.911, 3.036]\nclass = y[0]'),
 Text(0.10686636353670187, 0.625, 'node #744\ngini = 0.0\nsamples = 10\nvalue = [7.456, 0.0]\nclass = y[0]'),
 Text(0.10753457195508387, 0.625, 'node #745\narrival_month <= 9.0\ngini = 0.411\nsamples = 12\nvalue = [7.456, 3.036]\nclass = y[0]'),
 Text(0.10720046774589287, 0.5972222222222222, 'node #746\ngini = 0.0\nsamples = 5\nvalue = [3.728, 0.0]\nclass = y[0]'),
 Text(0.10786867616427487, 0.5972222222222222, 'node #747\ngini = 0.495\nsamples = 7\nvalue = [3.728, 3.036]\nclass = y[0]'),
 Text(0.10786867616427487, 0.6527777777777778, 'node #748\ngini = -0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.10853688458265685, 0.7083333333333334, 'node #749\narrival_date <= 6.5\ngini = 0.489\nsamples = 5\nvalue = [2.237, 3.036]\nclass = y[1]'),
 Text(0.10820278037346587, 0.6805555555555556, 'node #750\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.10887098879184785, 0.6805555555555556, 'node #751\nlead_time <= 94.0\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.10853688458265685, 0.6527777777777778, 'node #752\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.10920509300103885, 0.6527777777777778, 'node #753\navg_price_per_room <= 61.625\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.10887098879184785, 0.625, 'node #754\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.10953919721022985, 0.625, 'node #755\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.10953919721022985, 0.7361111111111112, 'node #756\nlead_time <= 98.0\ngini = 0.4\nsamples = 16\nvalue = [5.219, 13.663]\nclass = y[1]'),
 Text(0.10920509300103885, 0.7083333333333334, 'node #757\ngini = 0.0\nsamples = 6\nvalue = [4.473, 0.0]\nclass = y[0]'),
 Text(0.10987330141942085, 0.7083333333333334, 'node #758\navg_price_per_room <= 63.25\ngini = 0.098\nsamples = 10\nvalue = [0.746, 13.663]\nclass = y[1]'),
 Text(0.10953919721022985, 0.6805555555555556, 'node #759\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.11020740562861185, 0.6805555555555556, 'node #760\ngini = 0.0\nsamples = 9\nvalue = [0.0, 13.663]\nclass = y[1]'),
 Text(0.11424275803024687, 0.7916666666666666, 'node #761\narrival_month <= 3.5\ngini = 0.427\nsamples = 211\nvalue = [128.98, 57.688]\nclass = y[0]'),
 Text(0.11221203088375784, 0.7638888888888888, 'node #762\navg_price_per_room <= 88.5\ngini = 0.092\nsamples = 82\nvalue = [59.644, 3.036]\nclass = y[0]'),
 Text(0.11154382246537584, 0.7361111111111112, 'node #763\ntotal_nights <= 1.5\ngini = 0.049\nsamples = 80\nvalue = [58.899, 1.518]\nclass = y[0]'),
 Text(0.11120971825618484, 0.7083333333333334, 'node #764\navg_price_per_room <= 80.5\ngini = 0.123\nsamples = 30\nvalue = [21.621, 1.518]\nclass = y[0]'),
 Text(0.11087561404699385, 0.6805555555555556, 'node #765\nno_of_adults <= 1.5\ngini = 0.378\nsamples = 7\nvalue = [4.473, 1.518]\nclass = y[0]'),
 Text(0.11054150983780285, 0.6527777777777778, 'node #766\ngini = 0.411\nsamples = 6\nvalue = [3.728, 1.518]\nclass = y[0]'),
 Text(0.11120971825618484, 0.6527777777777778, 'node #767\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.11154382246537584, 0.6805555555555556, 'node #768\ngini = 0.0\nsamples = 23\nvalue = [17.148, 0.0]\nclass = y[0]'),
 Text(0.11187792667456684, 0.7083333333333334, 'node #769\ngini = 0.0\nsamples = 50\nvalue = [37.278, 0.0]\nclass = y[0]'),
 Text(0.11288023930213983, 0.7361111111111112, 'node #770\nmarket_segment_type_Offline <= 0.5\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.11254613509294883, 0.7083333333333334, 'node #771\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.11321434351133083, 0.7083333333333334, 'node #772\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.1162734851767359, 0.7638888888888888, 'node #773\narrival_month <= 4.5\ngini = 0.493\nsamples = 129\nvalue = [69.336, 54.652]\nclass = y[0]'),
 Text(0.11421665613890382, 0.7361111111111112, 'node #774\navg_price_per_room <= 80.375\ngini = 0.151\nsamples = 13\nvalue = [1.491, 16.699]\nclass = y[1]'),
 Text(0.11388255192971283, 0.7083333333333334, 'node #775\ngini = -0.0\nsamples = 11\nvalue = [0.0, 16.699]\nclass = y[1]'),
 Text(0.11455076034809482, 0.7083333333333334, 'node #776\ntotal_nights <= 4.5\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.11421665613890382, 0.6805555555555556, 'node #777\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.11488486455728582, 0.6805555555555556, 'node #778\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.118330314214568, 0.7361111111111112, 'node #779\nno_of_adults <= 1.5\ngini = 0.46\nsamples = 116\nvalue = [67.845, 37.953]\nclass = y[0]'),
 Text(0.11622128139404982, 0.7083333333333334, 'node #780\navg_price_per_room <= 86.0\ngini = 0.462\nsamples = 28\nvalue = [11.183, 19.736]\nclass = y[1]'),
 Text(0.11555307297566782, 0.6805555555555556, 'node #781\narrival_month <= 8.0\ngini = 0.208\nsamples = 14\nvalue = [2.237, 16.699]\nclass = y[1]'),
 Text(0.11521896876647682, 0.6527777777777778, 'node #782\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.11588717718485882, 0.6527777777777778, 'node #783\navg_price_per_room <= 77.07\ngini = 0.151\nsamples = 13\nvalue = [1.491, 16.699]\nclass = y[1]'),
 Text(0.11555307297566782, 0.625, 'node #784\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.11622128139404982, 0.625, 'node #785\nlead_time <= 101.5\ngini = 0.082\nsamples = 12\nvalue = [0.746, 16.699]\nclass = y[1]'),
 Text(0.11588717718485882, 0.5972222222222222, 'node #786\narrival_month <= 9.5\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.11555307297566782, 0.5694444444444444, 'node #787\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.11622128139404982, 0.5694444444444444, 'node #788\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.11655538560324082, 0.5972222222222222, 'node #789\ngini = 0.0\nsamples = 10\nvalue = [0.0, 15.181]\nclass = y[1]'),
 Text(0.11688948981243182, 0.6805555555555556, 'node #790\narrival_date <= 9.0\ngini = 0.378\nsamples = 14\nvalue = [8.947, 3.036]\nclass = y[0]'),
 Text(0.11655538560324082, 0.6527777777777778, 'node #791\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.1172235940216228, 0.6527777777777778, 'node #792\ngini = 0.394\nsamples = 13\nvalue = [8.201, 3.036]\nclass = y[0]'),
 Text(0.12043934703508616, 0.7083333333333334, 'node #793\narrival_date <= 22.5\ngini = 0.368\nsamples = 88\nvalue = [56.662, 18.217]\nclass = y[0]'),
 Text(0.11931174532906655, 0.6805555555555556, 'node #794\nno_of_adults <= 2.5\ngini = 0.168\nsamples = 63\nvalue = [44.733, 4.554]\nclass = y[0]'),
 Text(0.1183929587537913, 0.6527777777777778, 'node #795\nroom_type_reserved_Room_Type 4 <= 0.5\ngini = 0.121\nsamples = 61\nvalue = [43.988, 3.036]\nclass = y[0]'),
 Text(0.1175576982308138, 0.625, 'node #796\narrival_month <= 5.5\ngini = 0.0\nsamples = 50\nvalue = [37.278, 0.0]\nclass = y[0]'),
 Text(0.1172235940216228, 0.5972222222222222, 'node #797\ngini = 0.0\nsamples = 22\nvalue = [16.402, 0.0]\nclass = y[0]'),
 Text(0.1178918024400048, 0.5972222222222222, 'node #798\ngini = 0.0\nsamples = 28\nvalue = [20.875, 0.0]\nclass = y[0]'),
 Text(0.1192282192767688, 0.625, 'node #799\narrival_date <= 6.5\ngini = 0.429\nsamples = 11\nvalue = [6.71, 3.036]\nclass = y[0]'),
 Text(0.1185600108583868, 0.5972222222222222, 'node #800\nlead_time <= 96.5\ngini = 0.442\nsamples = 4\nvalue = [1.491, 3.036]\nclass = y[1]'),
 Text(0.1182259066491958, 0.5694444444444444, 'node #801\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.1188941150675778, 0.5694444444444444, 'node #802\ngini = -0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.11989642769515078, 0.5972222222222222, 'node #803\ntotal_nights <= 2.5\ngini = 0.0\nsamples = 7\nvalue = [5.219, 0.0]\nclass = y[0]'),
 Text(0.1195623234859598, 0.5694444444444444, 'node #804\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.12023053190434178, 0.5694444444444444, 'node #805\ngini = 0.0\nsamples = 5\nvalue = [3.728, 0.0]\nclass = y[0]'),
 Text(0.12023053190434178, 0.6527777777777778, 'node #806\navg_price_per_room <= 89.505\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.11989642769515078, 0.625, 'node #807\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.12056463611353278, 0.625, 'node #808\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.12156694874110578, 0.6805555555555556, 'node #809\nlead_time <= 96.5\ngini = 0.498\nsamples = 25\nvalue = [11.929, 13.663]\nclass = y[1]'),
 Text(0.12123284453191478, 0.6527777777777778, 'node #810\ngini = 0.0\nsamples = 8\nvalue = [5.964, 0.0]\nclass = y[0]'),
 Text(0.12190105295029678, 0.6527777777777778, 'node #811\navg_price_per_room <= 87.375\ngini = 0.423\nsamples = 17\nvalue = [5.964, 13.663]\nclass = y[1]'),
 Text(0.12123284453191478, 0.625, 'node #812\narrival_month <= 6.5\ngini = 0.294\nsamples = 13\nvalue = [2.982, 13.663]\nclass = y[1]'),
 Text(0.12089874032272378, 0.5972222222222222, 'node #813\ngini = 0.0\nsamples = 6\nvalue = [0.0, 9.109]\nclass = y[1]'),
 Text(0.12156694874110578, 0.5972222222222222, 'node #814\narrival_month <= 9.5\ngini = 0.478\nsamples = 7\nvalue = [2.982, 4.554]\nclass = y[1]'),
 Text(0.12123284453191478, 0.5694444444444444, 'node #815\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.12190105295029678, 0.5694444444444444, 'node #816\ntotal_nights <= 3.5\ngini = 0.242\nsamples = 4\nvalue = [0.746, 4.554]\nclass = y[1]'),
 Text(0.12156694874110578, 0.5416666666666666, 'node #817\narrival_date <= 26.5\ngini = 0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.12123284453191478, 0.5138888888888888, 'node #818\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.12190105295029678, 0.5138888888888888, 'node #819\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.12223515715948778, 0.5416666666666666, 'node #820\ngini = -0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.12256926136867878, 0.625, 'node #821\narrival_date <= 23.5\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.12223515715948778, 0.5972222222222222, 'node #822\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.12290336557786977, 0.5972222222222222, 'node #823\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.12799845476803248, 0.8194444444444444, 'node #824\narrival_date <= 11.5\ngini = 0.349\nsamples = 299\nvalue = [82.756, 285.406]\nclass = y[1]'),
 Text(0.12490799083301576, 0.7916666666666666, 'node #825\narrival_month <= 7.5\ngini = 0.494\nsamples = 79\nvalue = [36.532, 45.543]\nclass = y[1]'),
 Text(0.12357157399625177, 0.7638888888888888, 'node #826\nlead_time <= 108.5\ngini = 0.469\nsamples = 44\nvalue = [25.349, 15.181]\nclass = y[0]'),
 Text(0.12290336557786977, 0.7361111111111112, 'node #827\nno_of_adults <= 1.5\ngini = 0.499\nsamples = 26\nvalue = [12.674, 13.663]\nclass = y[1]'),
 Text(0.12256926136867878, 0.7083333333333334, 'node #828\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.12323746978706077, 0.7083333333333334, 'node #829\nlead_time <= 102.0\ngini = 0.486\nsamples = 22\nvalue = [9.692, 13.663]\nclass = y[1]'),
 Text(0.12290336557786977, 0.6805555555555556, 'node #830\ntotal_nights <= 3.5\ngini = 0.5\nsamples = 19\nvalue = [9.692, 9.109]\nclass = y[0]'),
 Text(0.12256926136867878, 0.6527777777777778, 'node #831\ngini = 0.499\nsamples = 17\nvalue = [8.201, 9.109]\nclass = y[1]'),
 Text(0.12323746978706077, 0.6527777777777778, 'node #832\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.12357157399625177, 0.6805555555555556, 'node #833\ngini = -0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.12423978241463376, 0.7361111111111112, 'node #834\ntotal_nights <= 2.5\ngini = 0.191\nsamples = 18\nvalue = [12.674, 1.518]\nclass = y[0]'),
 Text(0.12390567820544277, 0.7083333333333334, 'node #835\ngini = 0.248\nsamples = 13\nvalue = [8.947, 1.518]\nclass = y[0]'),
 Text(0.12457388662382476, 0.7083333333333334, 'node #836\ngini = 0.0\nsamples = 5\nvalue = [3.728, 0.0]\nclass = y[0]'),
 Text(0.12624440766977976, 0.7638888888888888, 'node #837\nlead_time <= 110.5\ngini = 0.393\nsamples = 35\nvalue = [11.183, 30.362]\nclass = y[1]'),
 Text(0.12557619925139776, 0.7361111111111112, 'node #838\navg_price_per_room <= 116.75\ngini = 0.378\nsamples = 7\nvalue = [4.473, 1.518]\nclass = y[0]'),
 Text(0.12524209504220676, 0.7083333333333334, 'node #839\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.12591030346058876, 0.7083333333333334, 'node #840\ngini = 0.5\nsamples = 3\nvalue = [1.491, 1.518]\nclass = y[1]'),
 Text(0.12691261608816176, 0.7361111111111112, 'node #841\ntotal_nights <= 2.0\ngini = 0.306\nsamples = 28\nvalue = [6.71, 28.844]\nclass = y[1]'),
 Text(0.12657851187897076, 0.7083333333333334, 'node #842\ngini = 0.0\nsamples = 8\nvalue = [0.0, 12.145]\nclass = y[1]'),
 Text(0.12724672029735273, 0.7083333333333334, 'node #843\nlead_time <= 112.0\ngini = 0.409\nsamples = 20\nvalue = [6.71, 16.699]\nclass = y[1]'),
 Text(0.12691261608816176, 0.6805555555555556, 'node #844\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.12758082450654373, 0.6805555555555556, 'node #845\navg_price_per_room <= 128.65\ngini = 0.363\nsamples = 18\nvalue = [5.219, 16.699]\nclass = y[1]'),
 Text(0.12724672029735273, 0.6527777777777778, 'node #846\narrival_year <= 2017.5\ngini = 0.333\nsamples = 17\nvalue = [4.473, 16.699]\nclass = y[1]'),
 Text(0.12691261608816176, 0.625, 'node #847\ngini = 0.393\nsamples = 14\nvalue = [4.473, 12.145]\nclass = y[1]'),
 Text(0.12758082450654373, 0.625, 'node #848\ngini = -0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.12791492871573473, 0.6527777777777778, 'node #849\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.13108891870304923, 0.7916666666666666, 'node #850\navg_price_per_room <= 102.09\ngini = 0.271\nsamples = 220\nvalue = [46.224, 239.862]\nclass = y[1]'),
 Text(0.12891724134330773, 0.7638888888888888, 'node #851\narrival_date <= 14.5\ngini = 0.067\nsamples = 102\nvalue = [5.219, 144.221]\nclass = y[1]'),
 Text(0.12858313713411673, 0.7361111111111112, 'node #852\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.12925134555249873, 0.7361111111111112, 'node #853\narrival_month <= 2.5\ngini = 0.049\nsamples = 100\nvalue = [3.728, 144.221]\nclass = y[1]'),
 Text(0.12891724134330773, 0.7083333333333334, 'node #854\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.12958544976168973, 0.7083333333333334, 'node #855\navg_price_per_room <= 95.44\ngini = 0.04\nsamples = 99\nvalue = [2.982, 144.221]\nclass = y[1]'),
 Text(0.12891724134330773, 0.6805555555555556, 'node #856\narrival_month <= 6.0\ngini = 0.163\nsamples = 24\nvalue = [2.982, 30.362]\nclass = y[1]'),
 Text(0.12858313713411673, 0.6527777777777778, 'node #857\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.12925134555249873, 0.6527777777777778, 'node #858\nlead_time <= 106.5\ngini = 0.128\nsamples = 23\nvalue = [2.237, 30.362]\nclass = y[1]'),
 Text(0.12891724134330773, 0.625, 'node #859\narrival_year <= 2017.5\ngini = 0.082\nsamples = 12\nvalue = [0.746, 16.699]\nclass = y[1]'),
 Text(0.12858313713411673, 0.5972222222222222, 'node #860\ngini = 0.089\nsamples = 11\nvalue = [0.746, 15.181]\nclass = y[1]'),
 Text(0.12925134555249873, 0.5972222222222222, 'node #861\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.12958544976168973, 0.625, 'node #862\ngini = 0.177\nsamples = 11\nvalue = [1.491, 13.663]\nclass = y[1]'),
 Text(0.13025365818007173, 0.6805555555555556, 'node #863\ntotal_nights <= 2.5\ngini = 0.0\nsamples = 75\nvalue = [0.0, 113.859]\nclass = y[1]'),
 Text(0.12991955397088073, 0.6527777777777778, 'node #864\ngini = 0.0\nsamples = 38\nvalue = [0.0, 57.688]\nclass = y[1]'),
 Text(0.13058776238926273, 0.6527777777777778, 'node #865\ngini = 0.0\nsamples = 37\nvalue = [0.0, 56.17]\nclass = y[1]'),
 Text(0.1332605960627907, 0.7638888888888888, 'node #866\navg_price_per_room <= 109.5\ngini = 0.42\nsamples = 118\nvalue = [41.005, 95.641]\nclass = y[1]'),
 Text(0.13192417922602673, 0.7361111111111112, 'node #867\ntotal_nights <= 1.5\ngini = 0.44\nsamples = 57\nvalue = [34.295, 16.699]\nclass = y[0]'),
 Text(0.13125597080764473, 0.7083333333333334, 'node #868\navg_price_per_room <= 108.5\ngini = 0.082\nsamples = 12\nvalue = [0.746, 16.699]\nclass = y[1]'),
 Text(0.13092186659845373, 0.6805555555555556, 'node #869\ngini = 0.0\nsamples = 11\nvalue = [0.0, 16.699]\nclass = y[1]'),
 Text(0.13159007501683573, 0.6805555555555556, 'node #870\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.13259238764440873, 0.7083333333333334, 'node #871\narrival_month <= 6.0\ngini = 0.0\nsamples = 45\nvalue = [33.55, 0.0]\nclass = y[0]'),
 Text(0.13225828343521773, 0.6805555555555556, 'node #872\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.1329264918535997, 0.6805555555555556, 'node #873\ngini = 0.0\nsamples = 42\nvalue = [31.313, 0.0]\nclass = y[0]'),
 Text(0.1345970128995547, 0.7361111111111112, 'node #874\navg_price_per_room <= 124.25\ngini = 0.144\nsamples = 61\nvalue = [6.71, 78.942]\nclass = y[1]'),
 Text(0.1339288044811727, 0.7083333333333334, 'node #875\narrival_date <= 19.5\ngini = 0.073\nsamples = 54\nvalue = [2.982, 75.906]\nclass = y[1]'),
 Text(0.1335947002719817, 0.6805555555555556, 'node #876\ngini = 0.0\nsamples = 47\nvalue = [0.0, 71.351]\nclass = y[1]'),
 Text(0.1342629086903637, 0.6805555555555556, 'node #877\navg_price_per_room <= 114.58\ngini = 0.478\nsamples = 7\nvalue = [2.982, 4.554]\nclass = y[1]'),
 Text(0.1339288044811727, 0.6527777777777778, 'node #878\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.1345970128995547, 0.6527777777777778, 'node #879\ngini = 0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.1352652213179367, 0.7083333333333334, 'node #880\narrival_date <= 27.5\ngini = 0.495\nsamples = 7\nvalue = [3.728, 3.036]\nclass = y[0]'),
 Text(0.1349311171087457, 0.6805555555555556, 'node #881\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.1355993255271277, 0.6805555555555556, 'node #882\ntotal_nights <= 3.5\ngini = 0.442\nsamples = 4\nvalue = [1.491, 3.036]\nclass = y[1]'),
 Text(0.1352652213179367, 0.6527777777777778, 'node #883\narrival_year <= 2017.5\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.1349311171087457, 0.625, 'node #884\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.1355993255271277, 0.625, 'node #885\navg_price_per_room <= 177.835\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.1352652213179367, 0.5972222222222222, 'node #886\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.1359334297363187, 0.5972222222222222, 'node #887\ngini = -0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.1359334297363187, 0.6527777777777778, 'node #888\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.1421926632803813, 0.8472222222222222, 'node #889\nno_of_adults <= 1.5\ngini = 0.414\nsamples = 509\nvalue = [315.368, 130.558]\nclass = y[0]'),
 Text(0.1415244548619993, 0.8194444444444444, 'node #890\navg_price_per_room <= 122.0\ngini = 0.055\nsamples = 143\nvalue = [105.123, 3.036]\nclass = y[0]'),
 Text(0.1411903506528083, 0.7916666666666666, 'node #891\ngini = 0.0\nsamples = 141\nvalue = [105.123, 0.0]\nclass = y[0]'),
 Text(0.1418585590711903, 0.7916666666666666, 'node #892\ngini = -0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.1428608716987633, 0.8194444444444444, 'node #893\narrival_month <= 11.5\ngini = 0.47\nsamples = 366\nvalue = [210.246, 127.522]\nclass = y[0]'),
 Text(0.1425267674895723, 0.7916666666666666, 'node #894\narrival_date <= 7.5\ngini = 0.493\nsamples = 301\nvalue = [161.785, 127.522]\nclass = y[0]'),
 Text(0.1376039507822737, 0.7638888888888888, 'node #895\nlead_time <= 150.5\ngini = 0.177\nsamples = 59\nvalue = [41.751, 4.554]\nclass = y[0]'),
 Text(0.1372698465730827, 0.7361111111111112, 'node #896\narrival_month <= 5.0\ngini = 0.126\nsamples = 58\nvalue = [41.751, 3.036]\nclass = y[0]'),
 Text(0.1369357423638917, 0.7083333333333334, 'node #897\ngini = 0.0\nsamples = 33\nvalue = [24.603, 0.0]\nclass = y[0]'),
 Text(0.1376039507822737, 0.7083333333333334, 'node #898\narrival_month <= 6.5\ngini = 0.256\nsamples = 25\nvalue = [17.148, 3.036]\nclass = y[0]'),
 Text(0.1372698465730827, 0.6805555555555556, 'node #899\navg_price_per_room <= 107.5\ngini = 0.5\nsamples = 6\nvalue = [2.982, 3.036]\nclass = y[1]'),
 Text(0.1366016381547007, 0.6527777777777778, 'node #900\nlead_time <= 133.5\ngini = 0.317\nsamples = 3\nvalue = [0.746, 3.036]\nclass = y[1]'),
 Text(0.1362675339455097, 0.625, 'node #901\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.1369357423638917, 0.625, 'node #902\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.1379380549914647, 0.6527777777777778, 'node #903\nlead_time <= 130.0\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.1376039507822737, 0.625, 'node #904\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.1382721592006557, 0.625, 'node #905\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.1379380549914647, 0.6805555555555556, 'node #906\ngini = 0.0\nsamples = 19\nvalue = [14.165, 0.0]\nclass = y[0]'),
 Text(0.1379380549914647, 0.7361111111111112, 'node #907\ngini = -0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.1474495841968709, 0.7638888888888888, 'node #908\narrival_date <= 24.5\ngini = 0.5\nsamples = 242\nvalue = [120.034, 122.967]\nclass = y[1]'),
 Text(0.1456015702897832, 0.7361111111111112, 'node #909\ntotal_nights <= 3.5\ngini = 0.485\nsamples = 182\nvalue = [79.774, 113.859]\nclass = y[1]'),
 Text(0.14391016773075377, 0.7083333333333334, 'node #910\narrival_date <= 23.5\ngini = 0.448\nsamples = 141\nvalue = [53.68, 104.75]\nclass = y[1]'),
 Text(0.1421978836586499, 0.6805555555555556, 'node #911\navg_price_per_room <= 94.25\ngini = 0.484\nsamples = 121\nvalue = [52.934, 75.906]\nclass = y[1]'),
 Text(0.13977562814201516, 0.6527777777777778, 'node #912\navg_price_per_room <= 67.375\ngini = 0.477\nsamples = 62\nvalue = [35.041, 22.772]\nclass = y[0]'),
 Text(0.13894036761903766, 0.625, 'node #913\navg_price_per_room <= 52.125\ngini = 0.242\nsamples = 4\nvalue = [0.746, 4.554]\nclass = y[1]'),
 Text(0.13860626340984666, 0.5972222222222222, 'node #914\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.13927447182822866, 0.5972222222222222, 'node #915\ngini = -0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.14061088866499266, 0.625, 'node #916\navg_price_per_room <= 81.6\ngini = 0.453\nsamples = 58\nvalue = [34.295, 18.217]\nclass = y[0]'),
 Text(0.13994268024661066, 0.5972222222222222, 'node #917\narrival_month <= 3.5\ngini = 0.21\nsamples = 16\nvalue = [11.183, 1.518]\nclass = y[0]'),
 Text(0.13960857603741966, 0.5694444444444444, 'node #918\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.14027678445580166, 0.5694444444444444, 'node #919\ngini = 0.0\nsamples = 15\nvalue = [11.183, 0.0]\nclass = y[0]'),
 Text(0.14127909708337466, 0.5972222222222222, 'node #920\nroom_type_reserved_Room_Type 4 <= 0.5\ngini = 0.487\nsamples = 42\nvalue = [23.112, 16.699]\nclass = y[0]'),
 Text(0.14094499287418366, 0.5694444444444444, 'node #921\navg_price_per_room <= 86.43\ngini = 0.479\nsamples = 41\nvalue = [23.112, 15.181]\nclass = y[0]'),
 Text(0.14027678445580166, 0.5416666666666666, 'node #922\nlead_time <= 129.5\ngini = 0.5\nsamples = 19\nvalue = [9.692, 9.109]\nclass = y[0]'),
 Text(0.13994268024661066, 0.5138888888888888, 'node #923\narrival_month <= 5.0\ngini = 0.493\nsamples = 18\nvalue = [9.692, 7.591]\nclass = y[0]'),
 Text(0.13960857603741966, 0.4861111111111111, 'node #924\ntotal_nights <= 1.5\ngini = 0.499\nsamples = 16\nvalue = [8.201, 7.591]\nclass = y[0]'),
 Text(0.13927447182822866, 0.4583333333333333, 'node #925\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.13994268024661066, 0.4583333333333333, 'node #926\ngini = 0.5\nsamples = 15\nvalue = [7.456, 7.591]\nclass = y[1]'),
 Text(0.14027678445580166, 0.4861111111111111, 'node #927\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.14061088866499266, 0.5138888888888888, 'node #928\ngini = -0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.14161320129256566, 0.5416666666666666, 'node #929\narrival_year <= 2017.5\ngini = 0.429\nsamples = 22\nvalue = [13.42, 6.072]\nclass = y[0]'),
 Text(0.14127909708337466, 0.5138888888888888, 'node #930\ntotal_nights <= 2.0\ngini = 0.456\nsamples = 19\nvalue = [11.183, 6.072]\nclass = y[0]'),
 Text(0.14094499287418366, 0.4861111111111111, 'node #931\ngini = 0.447\nsamples = 10\nvalue = [5.964, 3.036]\nclass = y[0]'),
 Text(0.14161320129256566, 0.4861111111111111, 'node #932\ngini = 0.465\nsamples = 9\nvalue = [5.219, 3.036]\nclass = y[0]'),
 Text(0.14194730550175666, 0.5138888888888888, 'node #933\ngini = 0.0\nsamples = 3\nvalue = [2.237, 0.0]\nclass = y[0]'),
 Text(0.14161320129256566, 0.5694444444444444, 'node #934\ngini = -0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.14462013917528463, 0.6527777777777778, 'node #935\nroom_type_reserved_Room_Type 5 <= 0.5\ngini = 0.377\nsamples = 59\nvalue = [17.893, 53.134]\nclass = y[1]'),
 Text(0.14428603496609366, 0.625, 'node #936\navg_price_per_room <= 130.54\ngini = 0.36\nsamples = 57\nvalue = [16.402, 53.134]\nclass = y[1]'),
 Text(0.14395193075690266, 0.5972222222222222, 'node #937\nroom_type_reserved_Room_Type 4 <= 0.5\ngini = 0.352\nsamples = 56\nvalue = [15.657, 53.134]\nclass = y[1]'),
 Text(0.14361782654771166, 0.5694444444444444, 'node #938\nno_of_adults <= 2.5\ngini = 0.342\nsamples = 55\nvalue = [14.911, 53.134]\nclass = y[1]'),
 Text(0.14294961812932966, 0.5416666666666666, 'node #939\navg_price_per_room <= 95.25\ngini = 0.328\nsamples = 52\nvalue = [13.42, 51.616]\nclass = y[1]'),
 Text(0.14261551392013866, 0.5138888888888888, 'node #940\narrival_date <= 11.5\ngini = 0.352\nsamples = 48\nvalue = [13.42, 45.543]\nclass = y[1]'),
 Text(0.14228140971094766, 0.4861111111111111, 'node #941\ngini = 0.257\nsamples = 15\nvalue = [2.982, 16.699]\nclass = y[1]'),
 Text(0.14294961812932966, 0.4861111111111111, 'node #942\nlead_time <= 134.5\ngini = 0.39\nsamples = 33\nvalue = [10.438, 28.844]\nclass = y[1]'),
 Text(0.14261551392013866, 0.4583333333333333, 'node #943\ngini = 0.423\nsamples = 17\nvalue = [5.964, 13.663]\nclass = y[1]'),
 Text(0.14328372233852066, 0.4583333333333333, 'node #944\ngini = 0.352\nsamples = 16\nvalue = [4.473, 15.181]\nclass = y[1]'),
 Text(0.14328372233852066, 0.5138888888888888, 'node #945\ngini = 0.0\nsamples = 4\nvalue = [0.0, 6.072]\nclass = y[1]'),
 Text(0.14428603496609366, 0.5416666666666666, 'node #946\narrival_month <= 7.5\ngini = 0.5\nsamples = 3\nvalue = [1.491, 1.518]\nclass = y[1]'),
 Text(0.14395193075690266, 0.5138888888888888, 'node #947\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.14462013917528463, 0.5138888888888888, 'node #948\ngini = 0.442\nsamples = 2\nvalue = [0.746, 1.518]\nclass = y[1]'),
 Text(0.14428603496609366, 0.5694444444444444, 'node #949\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.14462013917528463, 0.5972222222222222, 'node #950\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.14495424338447563, 0.625, 'node #951\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.14562245180285763, 0.6805555555555556, 'node #952\navg_price_per_room <= 72.5\ngini = 0.049\nsamples = 20\nvalue = [0.746, 28.844]\nclass = y[1]'),
 Text(0.14528834759366663, 0.6527777777777778, 'node #953\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.14595655601204863, 0.6527777777777778, 'node #954\ngini = 0.0\nsamples = 19\nvalue = [0.0, 28.844]\nclass = y[1]'),
 Text(0.14729297284881263, 0.7083333333333334, 'node #955\navg_price_per_room <= 74.125\ngini = 0.384\nsamples = 41\nvalue = [26.094, 9.109]\nclass = y[0]'),
 Text(0.14695886863962163, 0.6805555555555556, 'node #956\narrival_month <= 6.5\ngini = 0.478\nsamples = 14\nvalue = [5.964, 9.109]\nclass = y[1]'),
 Text(0.14662476443043063, 0.6527777777777778, 'node #957\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.14729297284881263, 0.6527777777777778, 'node #958\navg_price_per_room <= 65.25\ngini = 0.372\nsamples = 10\nvalue = [2.982, 9.109]\nclass = y[1]'),
 Text(0.14695886863962163, 0.625, 'node #959\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.14762707705800363, 0.625, 'node #960\nlead_time <= 145.5\ngini = 0.317\nsamples = 9\nvalue = [2.237, 9.109]\nclass = y[1]'),
 Text(0.14729297284881263, 0.5972222222222222, 'node #961\narrival_month <= 9.0\ngini = 0.242\nsamples = 8\nvalue = [1.491, 9.109]\nclass = y[1]'),
 Text(0.14695886863962163, 0.5694444444444444, 'node #962\ntotal_nights <= 5.5\ngini = 0.372\nsamples = 5\nvalue = [1.491, 4.554]\nclass = y[1]'),
 Text(0.14662476443043063, 0.5416666666666666, 'node #963\ngini = 0.0\nsamples = 2\nvalue = [0.0, 3.036]\nclass = y[1]'),
 Text(0.14729297284881263, 0.5416666666666666, 'node #964\nlead_time <= 126.5\ngini = 0.5\nsamples = 3\nvalue = [1.491, 1.518]\nclass = y[1]'),
 Text(0.14695886863962163, 0.5138888888888888, 'node #965\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.14762707705800363, 0.5138888888888888, 'node #966\ngini = 0.0\nsamples = 2\nvalue = [1.491, 0.0]\nclass = y[0]'),
 Text(0.14762707705800363, 0.5694444444444444, 'node #967\ngini = 0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.14796118126719462, 0.5972222222222222, 'node #968\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.14762707705800363, 0.6805555555555556, 'node #969\ngini = -0.0\nsamples = 27\nvalue = [20.13, 0.0]\nclass = y[0]'),
 Text(0.14929759810395862, 0.7361111111111112, 'node #970\nroom_type_reserved_Room_Type 5 <= 0.5\ngini = 0.301\nsamples = 60\nvalue = [40.26, 9.109]\nclass = y[0]'),
 Text(0.14896349389476762, 0.7083333333333334, 'node #971\navg_price_per_room <= 57.25\ngini = 0.183\nsamples = 57\nvalue = [40.26, 4.554]\nclass = y[0]'),
 Text(0.14862938968557662, 0.6805555555555556, 'node #972\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.14929759810395862, 0.6805555555555556, 'node #973\narrival_date <= 27.5\ngini = 0.13\nsamples = 56\nvalue = [40.26, 3.036]\nclass = y[0]'),
 Text(0.14896349389476762, 0.6527777777777778, 'node #974\ngini = 0.0\nsamples = 26\nvalue = [19.384, 0.0]\nclass = y[0]'),
 Text(0.14963170231314962, 0.6527777777777778, 'node #975\ntotal_nights <= 3.5\ngini = 0.222\nsamples = 30\nvalue = [20.875, 3.036]\nclass = y[0]'),
 Text(0.14896349389476762, 0.625, 'node #976\ntype_of_meal_plan_Meal Plan 2 <= 0.5\ngini = 0.161\nsamples = 22\nvalue = [15.657, 1.518]\nclass = y[0]'),
 Text(0.14862938968557662, 0.5972222222222222, 'node #977\ngini = 0.0\nsamples = 8\nvalue = [5.964, 0.0]\nclass = y[0]'),
 Text(0.14929759810395862, 0.5972222222222222, 'node #978\ngini = 0.234\nsamples = 14\nvalue = [9.692, 1.518]\nclass = y[0]'),
 Text(0.1502999107315316, 0.625, 'node #979\navg_price_per_room <= 79.0\ngini = 0.349\nsamples = 8\nvalue = [5.219, 1.518]\nclass = y[0]'),
 Text(0.14996580652234062, 0.5972222222222222, 'node #980\ngini = 0.0\nsamples = 1\nvalue = [0.746, 0.0]\nclass = y[0]'),
 Text(0.1506340149407226, 0.5972222222222222, 'node #981\ngini = 0.378\nsamples = 7\nvalue = [4.473, 1.518]\nclass = y[0]'),
 Text(0.14963170231314962, 0.7083333333333334, 'node #982\ngini = -0.0\nsamples = 3\nvalue = [0.0, 4.554]\nclass = y[1]'),
 Text(0.1431949759079543, 0.7916666666666666, 'node #983\ngini = 0.0\nsamples = 65\nvalue = [48.461, 0.0]\nclass = y[0]'),
 Text(0.2399767716044437, 0.9027777777777778, 'node #984\nlead_time <= 13.5\ngini = 0.426\nsamples = 5272\nvalue = [1866.861, 4202.144]\nclass = y[1]'),
 Text(0.18240066925249404, 0.875, 'node #985\navg_price_per_room <= 99.445\ngini = 0.472\nsamples = 1413\nvalue = [808.924, 497.942]\nclass = y[0]'),
 Text(0.16937125764132868, 0.8472222222222222, 'node #986\narrival_month <= 1.5\ngini = 0.348\nsamples = 699\nvalue = [456.278, 132.076]\nclass = y[0]'),
 Text(0.16903715343213768, 0.8194444444444444, 'node #987\ngini = 0.0\nsamples = 124\nvalue = [92.448, 0.0]\nclass = y[0]'),
 Text(0.16970536185051968, 0.8194444444444444, 'node #988\narrival_month <= 8.5\ngini = 0.391\nsamples = 575\nvalue = [363.829, 132.076]\nclass = y[0]'),
 Text(0.1617977938681437, 0.7916666666666666, 'node #989\ntotal_nights <= 2.5\ngini = 0.466\nsamples = 341\nvalue = [197.571, 115.377]\nclass = y[0]'),
 Text(0.15763454219892772, 0.7638888888888888, 'node #990\nlead_time <= 5.5\ngini = 0.417\nsamples = 262\nvalue = [161.785, 68.315]\nclass = y[0]'),
 Text(0.15411078686761642, 0.7361111111111112, 'node #991\nno_of_adults <= 1.5\ngini = 0.342\nsamples = 190\nvalue = [124.507, 34.917]\nclass = y[0]'),
 Text(0.1526386401958686, 0.7083333333333334, 'node #992\narrival_month <= 2.5\ngini = 0.063\nsamples = 62\nvalue = [45.479, 1.518]\nclass = y[0]'),
 Text(0.1523045359866776, 0.6805555555555556, 'node #993\nlead_time <= 0.5\ngini = 0.149\nsamples = 24\nvalue = [17.148, 1.518]\nclass = y[0]'),
 Text(0.1519704317774866, 0.6527777777777778, 'node #994\narrival_date <= 12.0\ngini = 0.301\nsamples = 10\nvalue = [6.71, 1.518]\nclass = y[0]'),
 Text(0.1516363275682956, 0.625, 'node #995\navg_price_per_room <= 82.0\ngini = 0.447\nsamples = 5\nvalue = [2.982, 1.518]\nclass = y[0]'),
 Text(0.1513022233591046, 0.5972222222222222, 'node #996\ngini = 0.0\nsamples = 4\nvalue = [2.982, 0.0]\nclass = y[0]'),
 Text(0.1519704317774866, 0.5972222222222222, 'node #997\ngini = 0.0\nsamples = 1\nvalue = [0.0, 1.518]\nclass = y[1]'),
 Text(0.1523045359866776, 0.625, 'node #998\ngini = 0.0\nsamples = 5\nvalue = [3.728, 0.0]\nclass = y[0]'),
 Text(0.1526386401958686, 0.6527777777777778, 'node #999\ngini = -0.0\nsamples = 14\nvalue = [10.438, 0.0]\nclass = y[0]'),
 ...]
In [185]:
#Top-10 most important features in the decision tree
#The importance of a feature is computed as the (normalized) total reduction of the criterion brought by that feature

print(pd.DataFrame(decisiontree.feature_importances_, columns=["Imp"], index=X_train.columns).sort_values(
    by="Imp", ascending=False).head(n=10))
                             Imp
lead_time                   0.36
avg_price_per_room          0.15
market_segment_type_Online  0.09
arrival_date                0.09
no_of_special_requests      0.09
arrival_month               0.07
total_nights                0.06
no_of_adults                0.03
arrival_year                0.02
market_segment_type_Offline 0.01
In [186]:
#visualization of feature importance
importances = decisiontree.feature_importances_
indices = np.argsort(importances)

plt.figure(figsize=(12, 12))
plt.title("Feature Importances")
plt.barh(range(len(indices)), importances[indices], align="center")
plt.yticks(range(len(indices)), [feature_names[i] for i in indices])
plt.xlabel("Relative Importance")
plt.show()
Out[186]:
<Figure size 1200x1200 with 0 Axes>
Out[186]:
Text(0.5, 1.0, 'Feature Importances')
Out[186]:
<BarContainer object of 26 artists>
Out[186]:
([<matplotlib.axis.YTick at 0x7cf38840e170>,
  <matplotlib.axis.YTick at 0x7cf38840cee0>,
  <matplotlib.axis.YTick at 0x7cf38840ceb0>,
  <matplotlib.axis.YTick at 0x7cf37b57c100>,
  <matplotlib.axis.YTick at 0x7cf37b57fd90>,
  <matplotlib.axis.YTick at 0x7cf37b57cb50>,
  <matplotlib.axis.YTick at 0x7cf37b57c280>,
  <matplotlib.axis.YTick at 0x7cf37b57cf10>,
  <matplotlib.axis.YTick at 0x7cf3823cba90>,
  <matplotlib.axis.YTick at 0x7cf3823cb970>,
  <matplotlib.axis.YTick at 0x7cf37b57c7c0>,
  <matplotlib.axis.YTick at 0x7cf3823ca020>,
  <matplotlib.axis.YTick at 0x7cf37b57c520>,
  <matplotlib.axis.YTick at 0x7cf3823cbf40>,
  <matplotlib.axis.YTick at 0x7cf37b911870>,
  <matplotlib.axis.YTick at 0x7cf3823cad10>,
  <matplotlib.axis.YTick at 0x7cf37b57e980>,
  <matplotlib.axis.YTick at 0x7cf3813fd810>,
  <matplotlib.axis.YTick at 0x7cf3813fd480>,
  <matplotlib.axis.YTick at 0x7cf3813fd420>,
  <matplotlib.axis.YTick at 0x7cf3813fdba0>,
  <matplotlib.axis.YTick at 0x7cf3823c9e40>,
  <matplotlib.axis.YTick at 0x7cf3813fceb0>,
  <matplotlib.axis.YTick at 0x7cf3813fd270>,
  <matplotlib.axis.YTick at 0x7cf3813fe770>,
  <matplotlib.axis.YTick at 0x7cf3813ff550>],
 [Text(0, 0, 'market_segment_type_Complementary'),
  Text(0, 1, 'type_of_meal_plan_Meal Plan 3'),
  Text(0, 2, 'room_type_reserved_Room_Type 3'),
  Text(0, 3, 'no_of_previous_bookings_not_canceled'),
  Text(0, 4, 'no_of_previous_cancellations'),
  Text(0, 5, 'room_type_reserved_Room_Type 7'),
  Text(0, 6, 'market_segment_type_Corporate'),
  Text(0, 7, 'room_type_reserved_Room_Type 6'),
  Text(0, 8, 'repeated_guest'),
  Text(0, 9, 'room_type_reserved_Room_Type 5'),
  Text(0, 10, 'room_type_reserved_Room_Type 2'),
  Text(0, 11, 'type_of_meal_plan_Meal Plan 2'),
  Text(0, 12, 'no_of_children'),
  Text(0, 13, 'type_of_meal_plan_Not Selected'),
  Text(0, 14, 'room_type_reserved_Room_Type 4'),
  Text(0, 15, 'required_car_parking_space'),
  Text(0, 16, 'market_segment_type_Offline'),
  Text(0, 17, 'arrival_year'),
  Text(0, 18, 'no_of_adults'),
  Text(0, 19, 'total_nights'),
  Text(0, 20, 'arrival_month'),
  Text(0, 21, 'no_of_special_requests'),
  Text(0, 22, 'arrival_date'),
  Text(0, 23, 'market_segment_type_Online'),
  Text(0, 24, 'avg_price_per_room'),
  Text(0, 25, 'lead_time')])
Out[186]:
Text(0.5, 0, 'Relative Importance')

lead_time is the highest variable in the feature imprtance.

Using GridSearch for Hyperparameter tuning of our tree model

  • Hyperparameter tuning for our tree model using GridSearch is a crucial step.
  • However, it can be challenging because there’s no direct formula to calculate how adjusting a hyperparameter value will impact the model’s loss.
  • Instead, we rely on experimentation to find the optimal hyperparameter values i.e we'll use Grid search
  • Grid search exhaustively explores specific parameter values and optimizes the model by cross-validating over a parameter grid.
In [187]:
# Choose the type of classifier.
estimator = DecisionTreeClassifier(random_state=1, class_weight="balanced")

# Grid of parameters to choose from
parameters = {
    "max_depth": np.arange(4, 13, 4),  # [4, 8, 12]
    "criterion": ["entropy", "gini"],
    "splitter": ["best", "random"],
    "min_impurity_decrease": [0.00001, 0.0001, 0.01, .1, 1],
    "max_leaf_nodes": [50, 75, 150, 250],
    "min_samples_split": [10, 30, 50, 70],

}

# Type of scoring used to compare parameter combinations
accuracy_scorer = make_scorer(recall_score)

# Run the grid search
grid_obj = GridSearchCV(estimator, parameters, scoring=accuracy_scorer, cv=5)
grid_obj = grid_obj.fit(X_train1, y_train)

# Set the clf to the best combination of parameters
estimator = grid_obj.best_estimator_

# Fit the best algorithm to the data.
estimator.fit(X_train, y_train)
Out[187]:
DecisionTreeClassifier(class_weight='balanced', criterion='entropy',
                       max_depth=4, max_leaf_nodes=50,
                       min_impurity_decrease=1e-05, min_samples_split=10,
                       random_state=1)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
DecisionTreeClassifier(class_weight='balanced', criterion='entropy',
                       max_depth=4, max_leaf_nodes=50,
                       min_impurity_decrease=1e-05, min_samples_split=10,
                       random_state=1)
In [188]:
confusion_matrix_sklearn(estimator, X_train, y_train)
In [189]:
decision_tree_tune_perf_train = model_performance_classification_sklearn(
    estimator, X_train, y_train
)
decision_tree_tune_perf_train
Out[189]:
Accuracy Recall Precision F1
0 0.81 0.72 0.70 0.71
In [190]:
confusion_matrix_sklearn(estimator, X_test, y_test)
In [191]:
decision_tree_tune_perf_test = model_performance_classification_sklearn(
    estimator, X_test, y_test
)
decision_tree_tune_perf_test
Out[191]:
Accuracy Recall Precision F1
0 0.81 0.72 0.71 0.71

The training and the test have similar results, however the F1 scores are .71 and .71 which is very low.
Also the recall score is 0.72 and 0.72.
Recall on training set went from .98 to .72, but this is an improvement because now the model has less overfitting.
There is still more work to do.

Visualing the Decison Tree

In [192]:
plt.figure(figsize=(35, 10))
out = tree.plot_tree(
    estimator,
    feature_names=feature_names,
    filled=True,
    fontsize=9,
    node_ids=False,
    class_names=None,
)
for o in out:
    arrow = o.arrow_patch
    if arrow is not None:
        arrow.set_edgecolor("black")
        arrow.set_linewidth(1)
plt.show()
Out[192]:
<Figure size 3500x1000 with 0 Axes>

This decisontree looks better than the previous one.
Still has several nodes but way easier to read.

In [193]:
#Print the top-10 most important features in the decision tree
#The importance of a feature is computed as the (normalized) total reduction of the criterion brought by that feature

print(pd.DataFrame(estimator.feature_importances_, columns=["Imp"], index=X_train.columns).sort_values(
    by="Imp", ascending=False).head(n=10))
                                   Imp
lead_time                         0.46
market_segment_type_Online        0.22
no_of_special_requests            0.20
avg_price_per_room                0.08
arrival_month                     0.02
market_segment_type_Offline       0.01
type_of_meal_plan_Not Selected    0.00
market_segment_type_Corporate     0.00
market_segment_type_Complementary 0.00
room_type_reserved_Room_Type 7    0.00

The pre-pruned decision tree model shows lead_time and market_segment_type_online are the two most important variables for predicting a booking's cancellation.
The third most important variable, no_of_special_requests.

Cost Complexity Pruning

Minimal cost complexity pruning identifies the node with the ‘weakest link’ in a decision tree. This ‘weakest link’ is characterized by an effective alpha, where nodes with the smallest effective alpha are pruned first. To determine suitable values for the pruning parameter (ccp_alpha), scikit-learn provides the DecisionTreeClassifier.cost_complexity_pruning_path function. This function returns the effective alphas and corresponding total leaf impurities at each step of the pruning process. As the alpha value increases, more of the tree is pruned, leading to increased total impurity in its leaves.

In summary, cost complexity pruning helps control the size of decision trees by selectively removing nodes based on their impact on model complexity and impurity reduction. By adjusting the ccp_alpha parameter, you can strike a balance between model accuracy and simplicity.

In [194]:
# Set the classifier first
cif = DecisionTreeClassifier(random_state=1, class_weight="balanced")

# Compute the pruning for training data
path = cif.cost_complexity_pruning_path(X_train, y_train)

# Come up with all the ccp alphas and corresponding impurities
ccp_alphas, impurities = path.ccp_alphas, path.impurities

'''
Explanation of ccp_alphas: These values represent different thresholds of cost complexity.
Each value corresponds to a point where a split in the decision tree will be pruned if it doesn't improve the model's overall complexity cost by at least that amount.
The array is sorted in increasing order. Starting with the smallest alpha (the least complex tree),
each subsequent alpha increases the penalty for complexity, resulting in a simpler (more pruned) tree.


The goal is to find the ccp_alpha value that maximizes performance on the validation or test set, which may not necessarily be the highest ccp_alpha.
The optimal ccp_alpha achieves the best trade-off between overfitting and underfitting, leading to a model that generalizes well to new data.


Explanation of impurities: This array provides the total impurity of the tree at each level of pruning defined by ccp_alphas.
Impurity is a measure of how mixed the classes are in the leaves of the tree. As pruning increases (with larger ccp_alpha values),
the impurity might initially decrease, as overfitting reduces, but then can increase if the model becomes too simple and underfits.

'''
Out[194]:
"\nExplanation of ccp_alphas: These values represent different thresholds of cost complexity.\nEach value corresponds to a point where a split in the decision tree will be pruned if it doesn't improve the model's overall complexity cost by at least that amount.\nThe array is sorted in increasing order. Starting with the smallest alpha (the least complex tree),\neach subsequent alpha increases the penalty for complexity, resulting in a simpler (more pruned) tree.\n\n\nThe goal is to find the ccp_alpha value that maximizes performance on the validation or test set, which may not necessarily be the highest ccp_alpha.\nThe optimal ccp_alpha achieves the best trade-off between overfitting and underfitting, leading to a model that generalizes well to new data.\n\n\nExplanation of impurities: This array provides the total impurity of the tree at each level of pruning defined by ccp_alphas.\nImpurity is a measure of how mixed the classes are in the leaves of the tree. As pruning increases (with larger ccp_alpha values),\nthe impurity might initially decrease, as overfitting reduces, but then can increase if the model becomes too simple and underfits.\n\n"
In [195]:
pd.DataFrame(path)
Out[195]:
ccp_alphas impurities
0 0.00 0.01
1 0.00 0.01
2 0.00 0.01
3 0.00 0.01
4 0.00 0.01
... ... ...
1843 0.01 0.33
1844 0.01 0.34
1845 0.01 0.35
1846 0.03 0.42
1847 0.08 0.50

1848 rows × 2 columns

In [196]:
fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(ccp_alphas[:-1], impurities[:-1], marker="o", drawstyle="steps-post")
ax.set_xlabel("effective alpha")
ax.set_ylabel("Total impurity of leaves")
ax.set_title("Total impurity vs Effective alpha for training set")
plt.show()
Out[196]:
[<matplotlib.lines.Line2D at 0x7cf37b481d20>]
Out[196]:
Text(0.5, 0, 'effective alpha')
Out[196]:
Text(0, 0.5, 'Total impurity of leaves')
Out[196]:
Text(0.5, 1.0, 'Total impurity vs Effective alpha for training set')

Next, we train a decision tree using the effective alphas. The last value in ccp_alphas is the alpha value that prunes the whole tree, leaving the tree, clfs[-1], with one node.

In [197]:
clfs = []
for ccp_alpha in ccp_alphas:
    clf = DecisionTreeClassifier(
        random_state=1, ccp_alpha=ccp_alpha, class_weight="balanced")
    clf.fit(X_train, y_train) ## Complete the code to fit decision tree on training data
    clfs.append(clf)
print(
    "Number of nodes in the last tree is: {} with ccp_alpha: {}".format(
        clfs[-1].tree_.node_count, ccp_alphas[-1])
)
Output hidden; open in https://colab.research.google.com to view.

For the remainder, we remove the last element in clfs and ccp_alphas, because it is the trivial tree with only one node. Here we show that the number of nodes and tree depth decreases as alpha increases.

In [198]:
clfs = clfs[:-1]
ccp_alphas = ccp_alphas[:-1]

node_counts = [clf.tree_.node_count for clf in clfs]
depth = [clf.tree_.max_depth for clf in clfs]
fig, ax = plt.subplots(2, 1, figsize=(10, 7))
ax[0].plot(ccp_alphas, node_counts, marker="o", drawstyle="steps-post")
ax[0].set_xlabel("Alpha")
ax[0].set_ylabel("Number of nodes")
ax[0].set_title("Number of nodes vs Alpha")
ax[1].plot(ccp_alphas, depth, marker="o", drawstyle="steps-post")
ax[1].set_xlabel("Alpha")
ax[1].set_ylabel("Depth of tree")
ax[1].set_title("Depth vs Alpha")
fig.tight_layout()
Out[198]:
[<matplotlib.lines.Line2D at 0x7cf37b242f50>]
Out[198]:
Text(0.5, 0, 'Alpha')
Out[198]:
Text(0, 0.5, 'Number of nodes')
Out[198]:
Text(0.5, 1.0, 'Number of nodes vs Alpha')
Out[198]:
[<matplotlib.lines.Line2D at 0x7cf37b2ff130>]
Out[198]:
Text(0.5, 0, 'Alpha')
Out[198]:
Text(0, 0.5, 'Depth of tree')
Out[198]:
Text(0.5, 1.0, 'Depth vs Alpha')
In [199]:
f1_train = []
for clf in clfs:
    pred_train = clf.predict(X_train)
    values_train = f1_score(y_train, pred_train)
    f1_train.append(values_train)
In [200]:
f1_test = []
for clf in clfs:
    pred_test = clf.predict(X_test)
    values_test = f1_score(y_test, pred_test)
    f1_test.append(values_test)
In [201]:
train_scores = [clf.score(X_train, y_train) for clf in clfs]
test_scores = [clf.score(X_test, y_test) for clf in clfs]
In [202]:
fig, ax = plt.subplots(figsize=(15, 5))
ax.set_xlabel("alpha")
ax.set_ylabel("F1")
ax.set_title("F1 vs Alpha for training and testing sets")
ax.plot(
    ccp_alphas, f1_train, marker="o", label="train", drawstyle="steps-post",
)
ax.plot(ccp_alphas, f1_test, marker="o", label="test", drawstyle="steps-post")
ax.legend()
plt.show()
Out[202]:
Text(0.5, 0, 'alpha')
Out[202]:
Text(0, 0.5, 'F1')
Out[202]:
Text(0.5, 1.0, 'F1 vs Alpha for training and testing sets')
Out[202]:
[<matplotlib.lines.Line2D at 0x7cf381af3070>]
Out[202]:
[<matplotlib.lines.Line2D at 0x7cf37b394250>]
Out[202]:
<matplotlib.legend.Legend at 0x7cf3887d94e0>

The F1 score for the training and testing are lining up almost perfectly.

In [203]:
recall_train = []
for clf in clfs:
    pred_train = clf.predict(X_train)
    values_train = recall_score(y_train, pred_train)
    recall_train.append(values_train)
In [204]:
recall_test = []
for clf in clfs:
    pred_test = clf.predict(X_test)
    values_test = recall_score(y_test, pred_test)
    recall_test.append(values_test)
In [205]:
train_scores = [clf.score(X_train, y_train) for clf in clfs]
test_scores = [clf.score(X_test, y_test) for clf in clfs]
In [206]:
fig, ax = plt.subplots(figsize=(15, 5))
ax.set_xlabel("alpha")
ax.set_ylabel("Recall")
ax.set_title("Recall vs alpha for training and testing sets")
ax.plot(
    ccp_alphas, recall_train, marker="o", label="train", drawstyle="steps-post",
)
ax.plot(ccp_alphas, recall_test, marker="o", label="test", drawstyle="steps-post")
ax.legend()
plt.show()
Out[206]:
Text(0.5, 0, 'alpha')
Out[206]:
Text(0, 0.5, 'Recall')
Out[206]:
Text(0.5, 1.0, 'Recall vs alpha for training and testing sets')
Out[206]:
[<matplotlib.lines.Line2D at 0x7cf37a5b9570>]
Out[206]:
[<matplotlib.lines.Line2D at 0x7cf37a5bb340>]
Out[206]:
<matplotlib.legend.Legend at 0x7cf37a5ba8f0>

The Recall score for the training and testing are lining up almost perfectly.

In [207]:
#create the model where we get highest train and test recall
index_post = np.argmax(f1_test)
decisiontree_post = clfs[index_post]
print(decisiontree_post)
DecisionTreeClassifier(ccp_alpha=0.00012535266224369257,
                       class_weight='balanced', random_state=1)
In [208]:
# creating the model where we get highest train and test recall
index_best_model = np.argmax(recall_test)
best_model = clfs[index_best_model]
print(best_model)
DecisionTreeClassifier(ccp_alpha=0.0001547772202137408, class_weight='balanced',
                       random_state=1)
In [209]:
decisiontree_post.fit(X_train, y_train)
Out[209]:
DecisionTreeClassifier(ccp_alpha=0.00012535266224369257,
                       class_weight='balanced', random_state=1)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
DecisionTreeClassifier(ccp_alpha=0.00012535266224369257,
                       class_weight='balanced', random_state=1)

Performance on Training Set

In [210]:
confusion_matrix_sklearn(best_model, X_train, y_train)
In [211]:
decision_tree_post_perf_train = model_performance_classification_sklearn(
    best_model, X_train, y_train
)
decision_tree_post_perf_train
Out[211]:
Accuracy Recall Precision F1
0 0.88 0.89 0.77 0.83

Performance on Test Set

In [212]:
confusion_matrix_sklearn(best_model, X_test, y_test)
In [213]:
decision_tree_post_perf_test = model_performance_classification_sklearn(
    best_model, X_test, y_test
)
decision_tree_post_perf_test
Out[213]:
Accuracy Recall Precision F1
0 0.86 0.86 0.74 0.79

The Fi score is 0.83 on the training set compared to 0.79 on the test set.
The Recall score is 0.89 on the training set compared to 0.86 on the test set.

In [214]:
plt.figure(figsize=(20, 10))

out = tree.plot_tree(
    best_model,
    feature_names=feature_names,
    filled=True,
    fontsize=9,
    node_ids=False,
    class_names=None,
)
for o in out:
    arrow = o.arrow_patch
    if arrow is not None:
        arrow.set_edgecolor("black")
        arrow.set_linewidth(1)
plt.show()
Out[214]:
<Figure size 2000x1000 with 0 Axes>

Tree is not getting less complex.

In [215]:
#Print the top-10 most important features in the decision tree
#The importance of a feature is computed as the (normalized) total reduction of the criterion brought by that feature

importances = decisiontree_post.feature_importances_
indices = np.argsort(importances)

print(pd.DataFrame(decisiontree_post.feature_importances_, columns=["Imp"], index=X_train.columns).sort_values(
    by="Imp", ascending=False).head(n=10))
                             Imp
lead_time                   0.40
market_segment_type_Online  0.14
no_of_special_requests      0.12
avg_price_per_room          0.12
arrival_month               0.06
arrival_date                0.04
total_nights                0.03
no_of_adults                0.03
arrival_year                0.02
market_segment_type_Offline 0.01
In [216]:
#visualization of feature importance
importances = decisiontree.feature_importances_
indices = np.argsort(importances)

plt.figure(figsize=(12, 12))
plt.title("Feature Importances")
plt.barh(range(len(indices)), importances[indices], color="aqua", align="center")
plt.yticks(range(len(indices)), [feature_names[i] for i in indices])
plt.xlabel("Relative Importance")
plt.show()
Out[216]:
<Figure size 1200x1200 with 0 Axes>
Out[216]:
Text(0.5, 1.0, 'Feature Importances')
Out[216]:
<BarContainer object of 26 artists>
Out[216]:
([<matplotlib.axis.YTick at 0x7cf3815c7a90>,
  <matplotlib.axis.YTick at 0x7cf3815c4a90>,
  <matplotlib.axis.YTick at 0x7cf381b25c30>,
  <matplotlib.axis.YTick at 0x7cf388a21810>,
  <matplotlib.axis.YTick at 0x7cf37b54c490>,
  <matplotlib.axis.YTick at 0x7cf388a20760>,
  <matplotlib.axis.YTick at 0x7cf38826ca30>,
  <matplotlib.axis.YTick at 0x7cf37bd49270>,
  <matplotlib.axis.YTick at 0x7cf37bd49ab0>,
  <matplotlib.axis.YTick at 0x7cf37b98c760>,
  <matplotlib.axis.YTick at 0x7cf37bd4a740>,
  <matplotlib.axis.YTick at 0x7cf38826f370>,
  <matplotlib.axis.YTick at 0x7cf37b54fc10>,
  <matplotlib.axis.YTick at 0x7cf37b9c6dd0>,
  <matplotlib.axis.YTick at 0x7cf37b9c7550>,
  <matplotlib.axis.YTick at 0x7cf37b972a70>,
  <matplotlib.axis.YTick at 0x7cf37b9c7eb0>,
  <matplotlib.axis.YTick at 0x7cf37b972c80>,
  <matplotlib.axis.YTick at 0x7cf37b6866e0>,
  <matplotlib.axis.YTick at 0x7cf37b686380>,
  <matplotlib.axis.YTick at 0x7cf37bac16f0>,
  <matplotlib.axis.YTick at 0x7cf37b686da0>,
  <matplotlib.axis.YTick at 0x7cf38826dfc0>,
  <matplotlib.axis.YTick at 0x7cf37b685fc0>,
  <matplotlib.axis.YTick at 0x7cf37b5befb0>,
  <matplotlib.axis.YTick at 0x7cf37b5be410>],
 [Text(0, 0, 'market_segment_type_Complementary'),
  Text(0, 1, 'type_of_meal_plan_Meal Plan 3'),
  Text(0, 2, 'room_type_reserved_Room_Type 3'),
  Text(0, 3, 'no_of_previous_bookings_not_canceled'),
  Text(0, 4, 'no_of_previous_cancellations'),
  Text(0, 5, 'room_type_reserved_Room_Type 7'),
  Text(0, 6, 'market_segment_type_Corporate'),
  Text(0, 7, 'room_type_reserved_Room_Type 6'),
  Text(0, 8, 'repeated_guest'),
  Text(0, 9, 'room_type_reserved_Room_Type 5'),
  Text(0, 10, 'room_type_reserved_Room_Type 2'),
  Text(0, 11, 'type_of_meal_plan_Meal Plan 2'),
  Text(0, 12, 'no_of_children'),
  Text(0, 13, 'type_of_meal_plan_Not Selected'),
  Text(0, 14, 'room_type_reserved_Room_Type 4'),
  Text(0, 15, 'required_car_parking_space'),
  Text(0, 16, 'market_segment_type_Offline'),
  Text(0, 17, 'arrival_year'),
  Text(0, 18, 'no_of_adults'),
  Text(0, 19, 'total_nights'),
  Text(0, 20, 'arrival_month'),
  Text(0, 21, 'no_of_special_requests'),
  Text(0, 22, 'arrival_date'),
  Text(0, 23, 'market_segment_type_Online'),
  Text(0, 24, 'avg_price_per_room'),
  Text(0, 25, 'lead_time')])
Out[216]:
Text(0.5, 0, 'Relative Importance')
In [217]:
# Choose the type of classifier.
estimator_updated = DecisionTreeClassifier(random_state=1) # random forest, xgboost, svm

# Grid of parameters to choose from
parameters = {
    "class_weight": [None, "balanced"],
    "max_depth": np.arange(3, 13, 3),  # [3, 9, 12)
    "criterion": ["entropy", "gini"],
    "splitter": ["best", "random"],
    "min_impurity_decrease": [0.00001, 0.0001, 0.01, 0.1, 1],
    "max_leaf_nodes": [50, 75, 150, 250],
    "min_samples_split": [10, 30, 50, 70],
}

# Type of scoring used to compare parameter combinations
scorer = make_scorer(f1_score)

# Run the grid search
grid_obj = GridSearchCV(estimator_updated, parameters, scoring=scorer, cv=5)
grid_obj = grid_obj.fit(X_train, y_train)

# Set the clf to the best combination of parameters
estimator_updated = grid_obj.best_estimator_

# Fit the best algorithm to the data.
estimator_updated.fit(X_train, y_train)
Out[217]:
DecisionTreeClassifier(max_depth=12, max_leaf_nodes=250,
                       min_impurity_decrease=0.0001, min_samples_split=10,
                       random_state=1)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
DecisionTreeClassifier(max_depth=12, max_leaf_nodes=250,
                       min_impurity_decrease=0.0001, min_samples_split=10,
                       random_state=1)
In [218]:
confusion_matrix_sklearn(estimator_updated, X_train, y_train)
In [219]:
decision_tree_tune_perf_train_updated = model_performance_classification_sklearn(
    estimator_updated, X_train, y_train
)
decision_tree_tune_perf_train_updated
Out[219]:
Accuracy Recall Precision F1
0 0.89 0.81 0.84 0.83
In [220]:
confusion_matrix_sklearn(estimator_updated, X_test, y_test)
In [221]:
decision_tree_tune_perf_test_updated = model_performance_classification_sklearn(
    estimator_updated, X_test, y_test
)
decision_tree_tune_perf_test_updated
Out[221]:
Accuracy Recall Precision F1
0 0.88 0.79 0.83 0.81
In [222]:
plt.figure(figsize=(35, 10))
out = tree.plot_tree(
    estimator_updated,
    feature_names=feature_names,
    filled=True,
    fontsize=9,
    node_ids=False,
    class_names=None,
)
# below code will add arrows to the decision tree split if they are missing
for o in out:
    arrow = o.arrow_patch
    if arrow is not None:
        arrow.set_edgecolor("black")
        arrow.set_linewidth(1)
plt.show()
Out[222]:
<Figure size 3500x1000 with 0 Axes>

The model has a better F1 score on both the training and testing datasets than the logistic regression models.
This is a simpler model and appears to perform similarly well on both the training and test dataset, indicating that this model is not overfit to the training data and thereby should provide more generalizable predictions.

Model Performance Comparison and Conclusions¶

In [223]:
# training performance comparison

models_train_comp_df = pd.concat(
    [
        decision_tree_perf_train_without.T,
        decision_tree_perf_train.T,
        decision_tree_tune_perf_train.T,
        decision_tree_post_perf_train.T,
        decision_tree_tune_perf_train_updated.T,
    ],
    axis=1,
)
models_train_comp_df.columns = [
    "Decision Tree without class_weight",
    "Decision Tree with class_weight",
    "Decision Tree (Pre-Pruning)",
    "Decision Tree (Post-Pruning)",
    "Decision Tree (Readible Tree)",

]
print("Training performance comparison:")
models_train_comp_df
Training performance comparison:
Out[223]:
Decision Tree without class_weight Decision Tree with class_weight Decision Tree (Pre-Pruning) Decision Tree (Post-Pruning) Decision Tree (Readible Tree)
Accuracy 0.99 0.99 0.81 0.88 0.89
Recall 0.99 1.00 0.72 0.89 0.81
Precision 1.00 0.98 0.70 0.77 0.84
F1 0.99 0.99 0.71 0.83 0.83
In [225]:
# testing performance comparison

models_test_comp_df = pd.concat(
    [
        decision_tree_perf_test_without.T,
        decision_tree_perf_test.T,
        decision_tree_tune_perf_test.T,
        decision_tree_post_perf_test.T,
        decision_tree_tune_perf_test_updated.T,
    ],
    axis=1,
)
models_test_comp_df.columns = [
    "Decision Tree without class_weight",
    "Decision Tree with class_weight",
    "Decision Tree (Pre-Pruning)",
    "Decision Tree (Post-Pruning)",
    "Decision Tree (Readible Tree)",
]
print("Testing performance comparison:")
models_test_comp_df
Testing performance comparison:
Out[225]:
Decision Tree without class_weight Decision Tree with class_weight Decision Tree (Pre-Pruning) Decision Tree (Post-Pruning) Decision Tree (Readible Tree)
Accuracy 0.87 0.86 0.81 0.86 0.88
Recall 0.80 0.81 0.72 0.86 0.79
Precision 0.79 0.78 0.71 0.74 0.83
F1 0.80 0.79 0.71 0.79 0.81
  • Although based on presentation the pre-pruning decisiontree looks less complex, The post pruned tree is the best model since it is giving a slightly higher recall, accuracy, F1, and precision score on the train and test sets than the pre-pruned tree.
  • The most readible tree has a lower recall, accuracy, precision, and F1 score so it is not the best model.

Conclusions¶

  • We conducted an analysis of 36,275 booking cancellation decisions using five different Decision Tree classifiers to create a predictive model. These models can assist INN Hotels Group in predicting whether a booking will be canceled before the check-in date.

  • All five decision-tree models outperform the best-performing logistic regression model based on our objective criterion (F1 score), 0.69 pm the logistic regression test model and 0.71 (the lowest) decision tree (Pre-Pruning).

  • We visualized each model's decision tree and confusion matrix for better understanding. However, interpreting predictions from the original, pre-pruned, and post-pruned decision-tree models may be challenging for clients. For instance, the original and post-pruned decision tree is visually complex. The pre-pruned decision tree is slightly complex but is able to be read.

  • Despite efforts to reduce overfitting through tuning, both pre-pruning and post-pruning methods had minimal impact. Pre-pruning decisiontree looks better but could still have some overfitting that occurs.

  • The best-performing model (based on Recall), the post-pruned decision tree, has minimal performance gap between the training and test datasets:

    • Accuracy down .02 (Training .88 Test .86)
    • Recall down .03 (Training .89 Test .86)
    • Precision down .03 (Training .77 Test .74)
    • F1 down .04 (Training .83 Test .79)
  • The best-performing model (based on F1-Scre), the readible decision tree, has minimal performance gap between the training and test datasets:

    • Accuracy down .01 (Training .89 Test .88)
    • Recall down .02 (Training .81 Test .79)
    • Precision down .01 (Training .84 Test .83)
    • F1 down .02 (Training .83 Test .81)
  • INN Hotels should weigh the tradeoff between model performance, overfitting, and interpretability.

  • If a more understandable prediction model is desired, a max tree depth of 12 is recommended.

  • Alternatively, if INN Hotels prioritizes performance and is comfortable with a “black-box” model, either the post-pruning or the readible tree is a suitable choice.

Actionable Insights and Recommendations¶

  • What profitable policies for cancellations and refunds can the hotel adopt?
  • What other recommedations would you suggest to the hotel?

Our EDA and predictions from both models show:

Guest are less likely to cancel if:

  • They are a repeated guest
  • They have special requests
  • They have short lead times
  • They have special requests
  • They require a parking space
  • They book a room for January or December
  • They book a cheaper room
  • They are in the market segment:
    Corporate
    Complementary
  • Lowest 5 cancelation days:
    Day 2 - 23.14%
    Day 9 - 26.02%
    Day 14 - 26.33%
    Day 29 - 28.07%
    Day 5 - 28.42%

Guest are more likely to cancel if:

  • They have long lead times
  • They have no special requests
  • They book a more expensive room.
  • They are in the market segment: Online Offline
  • They book a room for June, July or August
  • Top 5 cancelation days:
    Day 15 - 42.26%
    Day 1 - 41.04%
    Day 30 - 38.24%
    Day 12 - 38.21%
    Day 26 - 37.09%

What profitable policies for cancellations and refunds can the hotel adopt?

  • Considering the coefficients in the logistic regression models and the features in the decision-tree models, both prediction models suggest that INN Hotels should contemplate implementing distinct cancellation and refund policies for guests traveling for either business or personal reasons.

  • INN Hotel should implement more/better incentives for corporate guests. Currently only 30% of corporate bookings are from repeat customers. Offering a reward for chosing an INN Hotel could further incentivize the corporate guest to stay with INN Hotel vs a competitor.

  • Repeated guests are very important as they cancel less than other guests. However, currently repeat guests only account for .3% of all guests. Researched should be done to determine how to increase these numbers. Incentives/Loyalty programs should be introduced to increase these percentages.

  • Moreover, if a hotel reaches full capacity or experiences overbooking, management can leverage the model to ensure that rooms remain available for repeat guests or business travelers.

  • By combining predictions from both models, management can identify the most probable scenarios for booking cancellations and allocate those rooms to the least likely cases within the same room category.

    • This should be used as supplemental evidence in support of the managements decision-making process.

What other recommedations would you suggest to the hotel?

  • INN Hotel should implement more/better incentives for online, guests. Currently only .4% of all bookings are from repeated customers. More research should be done on why guests are not chosing to stay more often at an INN hotel. A comparison should be done between INN hotel prices vs their competitors.

  • INN Hotel should implement more/better incentives for offline, guests. Currently only .8% of all bookings are from repeated customers. More research should be done on why guests are not chosing to stay more often at an INN hotel. A comparison should be done between INN hotel prices vs their competitors.

  • The costs associated with true and false positives and negatives should be calculated. If this is done the models can be enhanced to maximize expecited profits and predict expected loses. This would be extremely beneficial to management who attempt to lessen their loses.

  • More research should be done to understand why so many bookings are being canceled. More data and further analysis is needed to determine the cause.

  • Based on our data analysis, a clear seasonal pattern emerges in booking behavior.

    • The most bookings took place in October.
    • The fewest bookings took place in January.
    • Winter had 5,739 bookings. That accounts for 16% of the bookings.
    • Spring had 7,692 bookings. That accounts for 21% of the bookings.
    • Summer had 9,936 bookings. That accounts for 27% of the bookings.
    • Fall had 12,908 bookings. That accounts for 36% of the bookings.

As a result INN Hotels should determine how to boost sells for the Winter months. They can offer appealing deals that attract more customers and result in higher occupancy rates.

This information can also be used to allocate resources based on potential booking rates.

In [231]:
%%shell
jupyter nbconvert --to html //'/content/drive/MyDrive/Python_Course/Project_4/Project_SLC_DSBA_INNHotels_FullCode_Balance.ipynb'
[NbConvertApp] Converting notebook ///content/drive/MyDrive/Python_Course/Project_4/Project_SLC_DSBA_INNHotels_FullCode_Balance.ipynb to html
[NbConvertApp] Writing 10056902 bytes to /content/drive/MyDrive/Python_Course/Project_4/Project_SLC_DSBA_INNHotels_FullCode_Balance.html
Out[231]: